A brief explanation of the annotation format:

-------------------------------------------------------------------
[document] 22878
[sentence] click start , point to settings , and then click control panel
[textaction]  wc|0|click  wu|1|start  wp|-1|(none)
[worldaction]  c|-1|left click  u|131132-131130-w--1-51|start|Button  -1wbuttonstartwshell_traywnd
[optional]  0
[command]  !131132-131130-w--1-51:left click
[hla]
[textaction]  wc|3|point  wu|5|settings  wp|-1|(none)
[worldaction]  c|-1|left click  u|65688-0-t-3-11|Settings|Button  3tbuttonsettingswtoolbarwindow32wmenusitewbasebar
[optional]  0
[command]  !65688-0-t-3-11:left click
[hla]
[textaction]  wc|9|click  wu|10|control panel  wp|-1|(none)
[worldaction]  c|-1|left click  u|131288-0-t-0-3|Control Panel|Button  0tbuttoncontrol panelwtoolbarwindow32wmenusitewbasebar
[optional]  0
[command]  !131288-0-t-0-3:left click
[hla]
[sentence] double-click power options
[textaction]  wc|11|double-click  wu|12|power options  wp|-1|(none)
[worldaction]  c|-1|double click  u|262342-0-l-16-85|Power Options|Item  16litempower optionswsyslistview32watl shell embeddingwinternet explorer_serverwshelldll_defviewwcabinetwclasscontrol panel
[optional]  0
[command]  !262342-0-l-16-85:double click
-------------------------------------------------------------------

In the above example snippet:

1. Lines starting with '[document]' contain the document id. The
   document id corresponds to the id in the Windows 2000 dataset.
   The format of such lines is:
   '[document] <document id>'

2. Lines starting with '[sentence]' contain the sentences from the
   document. All word indices (see below) are zero-based indices
   into this line.

3. Lines containing the tags [textaction], [worldaction], [optional],
   [command], and optionally [hla] will occur in blocks following a
   [sentence] line. Each such block specifies the attributes of an
   action performed (or one that should be performed) on the target
   environment, and on the text by the learner being evaluated.
   Each of these tags is described below.

4. Lines starting with '[textaction]' identify the selected word
   indices, and are of the following format:
   '[textaction] [0x01] wc|<command word index>|<command word> [0x01] wu|<object word index>|<object word> [0x01] wp|<parameter word index>|<parameter word>'
   Here '[0x01]' is the hex control character, and a parameter word
   index of -1 indicates that the parameter word is not used by the
   selected command.

5. Lines starting with '[worldaction]' contain the command, object id,
   object name and object type in the following format:
   '[worldaction] [0x01] c|-1|<command> [0x01] u|<object id>|<object name>|<object type> [0x01] <cross-observation object identifier>'
   The <object id> is guaranteed to be unique only within the current
   observed world state. If the same world state is observed again at
   a different point in time, a given object id may refer to a
   completely different object (the object id is composed of the GUI
   element's Win32 HWND).
   The <cross-observation object identifier> (the final field, e.g.
   '-1wbuttonstartwshell_traywnd' in the example above) attempts to
   uniquely identify objects across multiple observations of the same
   state. While successful in the vast majority of cases, it is also
   not always guaranteed to be unique :-/

6. Lines starting with '[command]' specify the actual command that was
   sent to the target environment. These lines have the following
   format:
   '[command] <command string>'
   In the example above, each command string has the form
   '!<object id>:<command>'.

7. Lines starting with '[optional]' indicate whether the corresponding
   [command] is optional or not. A line containing
   '[optional] [0x01] 1' indicates an optional command, and
   '[optional] [0x01] 0' indicates a mandatory command.

8. A line containing the tag '[hla]' indicates that the corresponding
   [command] was produced by a high-level instruction.
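
The tag descriptions above can be turned into a small reader. The following is only a minimal sketch in Python, not a tool shipped with the dataset. It assumes that fields are separated by the 0x01 character (possibly padded with spaces), that each block begins with its [textaction] line, and that [hla] applies to the most recently started block, as in the example. The names Action, Sentence, parse_annotations and cross_state_id are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

SEP = "\x01"  # the [0x01] control character; assumed to separate fields


@dataclass
class Action:
    words: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # wc/wu/wp
    command_name: str = ""
    object_id: str = ""
    object_name: str = ""
    object_type: str = ""
    cross_state_id: str = ""  # final [worldaction] field (hypothetical name)
    optional: bool = False
    command: str = ""
    hla: bool = False


@dataclass
class Sentence:
    text: str
    actions: List[Action] = field(default_factory=list)


def split_line(line: str) -> Tuple[str, List[str]]:
    """Split '[tag] f1 <0x01> f2 ...' into the tag and its non-empty fields."""
    tag, _, rest = line.partition("]")
    fields = [f.strip() for f in rest.split(SEP)]
    return tag.lstrip("["), [f for f in fields if f]


def parse_annotations(lines: Iterable[str]) -> Tuple[str, List[Sentence]]:
    doc_id = ""
    sentences: List[Sentence] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("["):
            continue  # skip blank lines and the dashed separators
        tag, fields = split_line(line)
        if tag == "document":
            doc_id = fields[0]
        elif tag == "sentence":
            sentences.append(Sentence(fields[0]))
        elif tag == "textaction":
            # Assumption: a [textaction] line opens a new action block.
            action = Action()
            for f in fields:
                key, index, word = f.split("|", 2)
                action.words[key] = (int(index), word)
            sentences[-1].actions.append(action)
        elif tag in ("worldaction", "optional", "command", "hla"):
            action = sentences[-1].actions[-1]
            if tag == "worldaction":
                for f in fields:
                    parts = f.split("|")
                    if parts[0] == "c" and len(parts) == 3:
                        action.command_name = parts[2]
                    elif parts[0] == "u" and len(parts) == 4:
                        (_, action.object_id,
                         action.object_name, action.object_type) = parts
                    else:
                        action.cross_state_id = f
            elif tag == "optional":
                action.optional = fields[0] == "1"
            elif tag == "command":
                action.command = fields[0]
            else:  # hla
                action.hla = True
    return doc_id, sentences


# Excerpt of the example above, with explicit 0x01 separators.
sample = [
    "[document] 22878",
    "[sentence] click start , point to settings , and then click control panel",
    f"[textaction] {SEP} wc|0|click {SEP} wu|1|start {SEP} wp|-1|(none)",
    f"[worldaction] {SEP} c|-1|left click {SEP} u|131132-131130-w--1-51|start|Button"
    f" {SEP} -1wbuttonstartwshell_traywnd",
    f"[optional] {SEP} 0",
    f"[command] {SEP} !131132-131130-w--1-51:left click",
    "[hla]",
    "[sentence] double-click power options",
    f"[textaction] {SEP} wc|11|double-click {SEP} wu|12|power options {SEP} wp|-1|(none)",
    f"[worldaction] {SEP} c|-1|double click {SEP} u|262342-0-l-16-85|Power Options|Item",
    f"[optional] {SEP} 0",
    f"[command] {SEP} !262342-0-l-16-85:double click",
]

doc_id, sentences = parse_annotations(sample)
for s in sentences:
    for a in s.actions:
        print(doc_id, a.command_name, a.object_name, a.object_type, a.hla)
```

Because the object id is only unique within one observed world state, code that compares objects across observations would presumably key on the final [worldaction] field instead, keeping in mind that it is not guaranteed to be unique either.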