
CPT Crosswords

1. Introduction
    1.1 System Requirements  1.2 Features
    1.3 Versions
2. Browse Tab
    2.1 Choose Library Dialog  2.2 New Library Dialog
    2.3 Folder Filters Dialog
3. Source Tab
    3.1 Operations  3.2 Filters Dialog
4. Base Tab
    4.1 Words Mode  4.2 Clues Mode
5. Target Tab
    5.1 Operations  5.2 Diagrams Mode  5.3 Words Mode  5.4 Clues Mode
6. How To
    6.1 Generate Diagrams   6.2 Generate Crosswords  6.3 Make New Base Words
    6.4 Make New Base Clues
Appendix A: File Formats
    A.1 Text Dictionary  A.2 Tags

Appendix B: Language Support
1. Introduction

CPT Crosswords is a multilingual crossword compiler. The program includes the modules CPT Diagrams, CPT Words, CPT Clues, and CPT Editor, together with some of the dictionary tools from the CPT kit.

CPT Editor supports all steps of crossword creation, from scratch to the printout. It is described in a separate document.

CPT Diagrams is a generator that creates diagrams in several styles.

CPT Words is a generator that fills the diagrams with words.

CPT Clues is a simple clue generator that adds clues and other information to form the complete crossword.

Before reading this document you should browse "CPT - The Primer" for the basic notions.

1.1 System Requirements

CPT Crosswords is available for MS Windows and Linux on PCs. Since it is a Java program, a JVM (Java Virtual Machine, from a Java SDK or JRE) must be installed beforehand. The program has been adjusted to work with Sun's Java 1.1, 1.2, 1.3, and 1.4 on Windows and Linux. The MS JVM (jview) can be used as well; it offers good GUI speed, but has the Java 1.1.4 'features' and a very small number of Java encoding converters.

The installation requires 3.5 MB of disk space; 128 MB of RAM is recommended.

1.2 Features

1.3 Versions

The version of the program documented here is 1.0.
The pictures are from the Windows version, but the text covers the Linux version as well.
The versions of the modules used by this program are as follows: CPT Editor - 1.2, CPT Diagrams - 1.2, CPT Words - 1.1, CPT Clues - 1.0.

 

2. Browse Tab

After starting the program, you will see the following:

In Folder and Folder Type you set the folder in which the CPT Editor will be started from this tab. The folder types are:

To set the path for the current folder type, click on the 'Set Folder' button.

If the folder type is Files or Library, you can use the 'Folder Filters' button to open the Folder Filters dialog, where you can set filters for the files that will be shown in CPT Editor.

Search In globally defines the library that is searched when you click on the 'Search for same items' button in the Source/Target tab.

Save In globally defines the library in which the crossword set is saved when you click on the 'Save this set' button in the Source/Target tab.

Sets from the Base tab are saved in the Base Words/Clues folder.

With the Diagrams, Words, and Clues radio buttons you select the current mode (the CPT Mode) of the Source, Base, and Target tabs. Diagrams selects the CPT Diagrams generator, Words selects the CPT Words generator, and Clues selects the CPT Clues generator.

The 'Start' button will start the CPT Editor with the current selected folder.

The 'OK, save' button will save all current settings.

The 'Dismiss' button will stop the program.

2.1 Choose Library Dialog

This dialog is shown when you start a search or save operation from the Source or Target tab and the global setting is 'Choose Library'.

Note that the list of library files is filtered by size only, so you should check the data formats yourself. If you click on the 'Dismiss' button, the operation will be canceled. To proceed with the operation, select a file from the list and click on the OK button.

2.2 New Library Dialog

This dialog is used to create a new library when you click on the 'Save this set' button in the Source/Target tab and the global Save In setting in the Browse tab is 'New Library'.

If you click on the 'Dismiss' button, the save operation will be canceled. To proceed with the save operation, set the parameters and then click on the 'OK, save' button.

Name Tab

The Dimensions fields are disabled and show the size of the current Source/Target set.

You can use the Data Format to force a new type of library (with the implied conversions and a possible loss of data). Via RTL Numbers you can force conversion to an RTL library (when the new format is Grids+ and the input consists of non-RTL diagrams).

In the Name field you should enter the name of the file. Do not change the extension of the file - it will be set by the program depending on the data format.

Flags Tab

Use Style and Stage to set the global flags for this library. These flags are used for searching.

Encoding Tab

If the set you are saving contains text data, you should choose the proper encoding and locale (different from 'Default'). If the selected encoding is not the same as that of the input set, all data will be recoded. Note that a user-defined 8-bit converter can be set in CPT Editor.

2.3 Folder Filters Dialog

This dialog has the same layout as the New Library dialog and is not shown here.

If the folder type is Library, in the Dimensions fields you can enter a fixed size, or -1 to ignore the size filter. You can also set the Data Format and all other options as filters.

In the Name field you can enter a regular expression for the file name. If the folder type is Files, only this field will be enabled.
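
As a rough illustration of how the size and name filters combine, the sketch below treats -1 as "ignore this dimension" and applies a regular expression to the file name. The `LibraryEntry` record, the `matches_filters` function, and the file names are hypothetical and exist only for this example; the regular-expression syntax accepted by the program may differ from Python's.

```python
import re
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    """Hypothetical description of one library file (not a CPT structure)."""
    name: str
    columns: int
    rows: int


def matches_filters(entry, columns=-1, rows=-1, name_pattern=None):
    """Return True if the entry passes the size and name filters.

    A dimension of -1 means that the corresponding size filter is ignored.
    """
    if columns != -1 and entry.columns != columns:
        return False
    if rows != -1 and entry.rows != rows:
        return False
    if name_pattern is not None and re.fullmatch(name_pattern, entry.name) is None:
        return False
    return True


# Hypothetical file names, used only to exercise the filter.
entries = [
    LibraryEntry("british15.lib", 15, 15),
    LibraryEntry("blocks4.lib", 4, 4),
]
selected = [
    e.name
    for e in entries
    if matches_filters(e, columns=15, rows=-1, name_pattern=r".*\.lib")
]
```

Here only the 15-column entry is selected, because the row filter is switched off with -1 while the column filter and the name pattern are both applied.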

 

3. Source Tab

The Source is a temporary crossword set, which is the result from the select operation and is the input source for the Target tab 'generate' operation. The program maintains one working file for all CPT modes.

3.1 Operations

Here you can define all options and start an operation for the current Source:

The Dimensions text fields define the selected size (columns by rows). When the Source is empty, the fields are ignored and the size is determined by the input of the select operation.

Selection field shows the number of items in the current Source.

'Filters for selection' button will start the Source Selection Filters dialog.

Select From defines the input for the select operation. Note that the selection from 'Target' is CPT Mode sensitive. For example, if current CPT Mode is Diagrams, the working target file for mode Diagrams is the current Target.

Data Format defines the forced data format for the select operation. If the Source is empty, you can select one of the data formats, which could impose a conversion.

The 'Start' button will start the select operation if the Select check box is checked and/or will start the Editor if the Show check box is checked.

Use Filters, if checked, applies the defined filters to the select operation.

Append, if checked, appends the result of the select operation to the current Source.

The 'Search for same items' button will start the search operation according to the global Search In settings from the Browse tab. All items from the Source set which are found in the search set will be marked as deleted and you should start the Editor to perform the deletion or to remove the marks.

The 'Save this set' button will start the save operation according to the global Save In settings from the Browse tab.

The 'Delete this set' button will start the delete operation - the current Source will be removed. You might need to delete the current Source in order to do a proper select operation (no append, format by selection) from the Folder Browser window in CPT Editor.

3.2 Filters Dialog

When you click on 'Filters for selection' button, the Source Selection Filters dialog will appear. The filters define the diagram properties, which will be checked during the selection.

Min/Max Words defines the minimum/maximum number of words the diagram should contain. A value of -1 means that the filter is ignored.

Min/Max Word Length defines the minimum/maximum length of the words the diagram should contain. A value of -1 means that the filter is ignored.

Min/Max % Blacks defines the minimum/maximum percentage of black cells the diagram should contain. A value of -1.0 means that the filter is ignored.

Min/Max % Unches defines the minimum/maximum percentage of unchecked letters the diagram should contain. A value of -1.0 means that the filter is ignored.

Max Black Pattern defines the maximum number of sequential blacks in a row/column the diagram should contain. A value of -1 means that the filter is ignored.

Max Items defines the maximum number of items in the result of the select operation. A value of -1 means that the filter is ignored.

User Data defines the value of the user data field the diagram should contain. A value of -1 means that the filter is ignored.

Style defines the style of the diagram.

Check Style forces the check of the defined style of the diagram.

Has 4x4 White, Has 4x5 White, and Has 2x2 Black will force the checking of the corresponding rectangle of whites/blacks.

All of the remaining check boxes will force the checking of the corresponding property of the diagram: Convex, Standard Symmetry, Horizontal Symmetry, Vertical Symmetry, Barred, Unused Cells, Has Reversed, Has Data (has letters in the grid or has clues), Cells (the set is marked for the diagram generator), Marked (the set is marked by the diagram generator), User Flag (the set is marked by the user).
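
The numeric filters all share the convention that -1 (or -1.0) switches a bound off. The following minimal sketch shows that convention applied to two of the properties described above. The grid representation ('#' for a black cell, '.' for a white cell) and all function names are assumptions made for this example and do not describe the program's internal format.

```python
def in_range(value, minimum=-1, maximum=-1):
    """Check a value against optional bounds; -1 (or -1.0) disables a bound."""
    if minimum != -1 and value < minimum:
        return False
    if maximum != -1 and value > maximum:
        return False
    return True


def percent_blacks(grid):
    """Percentage of black cells in a grid given as a list of strings."""
    cells = sum(len(row) for row in grid)
    blacks = sum(row.count("#") for row in grid)
    return 100.0 * blacks / cells


def max_black_run(grid):
    """Longest run of consecutive blacks found in any row or column."""
    columns = ["".join(col) for col in zip(*grid)]
    longest = 0
    for line in list(grid) + columns:
        run = 0
        for cell in line:
            run = run + 1 if cell == "#" else 0
            longest = max(longest, run)
    return longest


grid = [
    "..#..",
    ".....",
    "#...#",
    ".....",
    "..#..",
]

# Min % Blacks = 10.0, Max % Blacks = 20.0, Max Black Pattern = 2
accepted = (
    in_range(percent_blacks(grid), 10.0, 20.0)
    and in_range(max_black_run(grid), maximum=2)
)
```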

 

4. Base Tab

The Base is a word list (named Base Words for Words mode) or dictionary with clues (named Base Clues for Clues mode) used by the generators. Technically, it is a binary CTree file, a format supported by all CPT programs.

The Base tab contains all operations for maintaining the Base Words/Clues: creation, adding, deleting, and saving. To change the current file, you should delete it first and then use the Browse Folder window in the Editor to select the new file from the proper Base Words/Clues folder. The tab is disabled when the CPT Mode is Diagrams.

4.1 Words Mode

When in Browse tab the CPT Mode is Words, you are working with the CPT Words generator and you will see the following:

The Words field shows the number of words in the current Base Words. If it is 0, the current file is deleted.

The button 'Set New Base Words Encoding' will start the Base Words Encoding dialog, where you should set the encoding and the locale when you are creating new Base Words.

Select CTree Dictionary if checked, will force the input from the selected CTree. Via 'Set CTree File' button you should select the path of the file. 'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of words from the CTree.

Select Text Word List if checked, will force the input from the selected text file containing word list. Via 'Set Word List File' button you should select the path of the file. The 'Text Word List Encoding' button will start the Word List Encoding dialog where you should set the encoding and the locale of the file. The format of the file is just a word per line.

When Show check box is checked and you click on 'Start' button, the Base Words dialog window containing the CTree header will be shown. If the word list is in Unicode with many characters used, it could take some time to prepare and format the complete information.

When Select check box is checked and you click on 'Start' button, the process of creating new Base Words will be started. The Messages window, showing all messages of the creation, will appear.

If Use Filters is checked, the creation process will do the selection of words according to the filters (valid when the input is CTree file). If Append is checked, the program will add the selected input to the current Base Words.

During the creation process the words are converted to 'crossword form' (lower case and all non-letter characters are ignored).
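
A minimal sketch of the 'crossword form' conversion applied to a one-word-per-line list is shown below. It assumes that "letter" means whatever Python's `str.isalpha` accepts; the program's own rules depend on the selected locale and encoding and may differ. The function names are illustrative only.

```python
def to_crossword_form(word):
    """Lower-case the word and drop every character that is not a letter."""
    return "".join(ch for ch in word.lower() if ch.isalpha())


def read_word_list(lines):
    """Convert a one-word-per-line list, skipping lines that end up empty."""
    words = []
    for line in lines:
        word = to_crossword_form(line)
        if word:
            words.append(word)
    return words


sample = ["Ice-cream", "New York", "  ", "o'clock"]
words = read_word_list(sample)
```

With this input, the hyphen, the space, and the apostrophe are dropped and the blank line is skipped.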

When you click on 'Save' button, the Save dialog will be started in the current Base Words folder. You can save the file under any name, but you should not change the default extension. It is recommended to do the save operation after any new Base Words creation.

If you click on 'Delete' button, the current file will be deleted.

4.2 Clues Mode

When in Browse tab the CPT Mode is Clues, you are working with the CPT Clues generator and you will see the following:

The Clue Words field shows the number of words in the current Base Clues. If it is 0, the current file is deleted.

The button 'Set New Base Clues Encoding' will start the Base Clues Encoding dialog, where you should set the encoding and the locale when you are creating new Base Clues.

Select CTree Dictionary if checked, will force the input from the selected CTree. Via 'Set CTree File' button you should select the path of the file. 'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of words and associated clues from the CTree.

Select Text Dictionary if checked, will force the input from the selected text file containing words with clues. Via 'Set Text Dictionary File' button you should select the path of the file. The 'Text Dictionary Encoding' button will start the Text Dictionary Encoding dialog where you should set the encoding and the locale of the file. The format of the file is described in the appendix.

When Show check box is checked and you click on 'Start' button, the Base Clues dialog containing the CTree header will be shown.

When Select check box is checked and you click on 'Start' button, the process of creating new Base Clues will be started. The Messages window, showing all messages of the creation, will appear.

If Use Filters is checked, the creation process will do the selection of words according to the filters (valid when the input is CTree file). If Append is checked, the program will add the selected input to the current Base Clues.

During the creation process the words are converted to 'crossword form' (lower case and all non-letter characters are ignored). The clue data is not changed (the only exception is when recoding has been selected). The tag data is set according to the Tags file of the current locale. The format of the Tags file is described in the appendix. Full details of the creation of a CTree with clues can be found in the documentation of the CPT Word Lists program.

When you click on 'Save' button, the Save dialog will be started in the current Base Clues folder. You can save the file under any name, but you should not change the default extension. It is recommended to do the save operation after any new Base Clues creation.

If you click on 'Delete' button, the current file will be deleted.

 

5. Target Tab

The Target is a temporary crossword set, which is the result from the generation process. The program maintains three working files: one for CPT Diagrams generator, one for CPT Words generator, and one for CPT Clues generator.

In Target Tab you can set the options, run the generators, and you can do all operations over the Target set. These operations are: creation (via the generators), browsing/editing (via the Editor), searching, saving, and deleting.

5.1 Operations

The 'Start' button will start the selected generator if Run check box is checked and/or will start the Editor if Show check box is checked.

The 'Search for same items' button will start the search operation according to the global Search In settings from the Browse tab. All items from the Target set, which are found in the search set will be marked as deleted, and you should start the Editor to perform the deletion or to remove the marks.

The 'Save this set' button will start the save operation according to the global Save In settings from the Browse tab.

The 'Delete this set' button will start the delete operation - the current Target will be removed.

5.2 Diagrams Mode

When in Browse tab the CPT Mode is Diagrams, you are working with the CPT Diagrams generator, and you will see the following:

The program uses the classical approach of 'generate and test'. The 'generate' part supplies B&W diagram and the 'test' part checks the conditions imposed by the user. If the diagram matches the requirements, it is written into the Target file. The variations are in the 'generate' part. The radio buttons No Source, Catenae, and Transform can be used to select one of the generation algorithm modes.

No Source. In this mode the program generates combinations of blacks and whites depending on the algorithm number selected (for Algorithm 1 - all possible variants, see below). For diagrams of normal size this is the preferred generation mode.

Catenae. This is the engine for big diagrams. It takes diagrams from the Source file and assembles them into bigger ones. The process is a sort of catenation of building blocks, hence the name. The generator produces B&W diagrams and accepts only the 'Diagrams' data format as input. Usually, the preferred input size is half the target size. For example, to generate 8x8, the optimum input size is 4x4, but you can use other sizes as well. When NY Times style with a wrapper and Standard symmetry is used for generation, the preferred input size is half of the middle part. For example, for target size 15x15 the half of the middle part is 9x5, and this should be the input size.

Transform. The source diagrams are converted to target ones according to the specified transformation operations (flipping, rotation, and shifting). The output diagrams have the same size as the input ones, except when the operation is a rotation and the sizes allow it.

In the Dimensions text fields you should set the desired size of the diagrams. The Result field shows the number of items in the current Target. When the Run check box is set and you press the 'Start' button, the generation will start. When Show is set, but not Run, and you press the 'Start' button, the Editor will be started with the current Target set. During the generation, 'Pause' and 'Stop' buttons will appear on the button bar. Use them to pause or stop the generation process.
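
The 'generate and test' approach can be pictured with a toy example. The sketch below enumerates every black-and-white grid of a tiny size in ascending binary order (comparable in spirit to No Source with all possible variants) and keeps those that pass a single test. It is only an illustration under stated assumptions: the grid representation, the function names, and the single percent-blacks condition are invented here and are not the program's implementation.

```python
def generate(columns, rows):
    """Yield every B&W grid of the given size in ascending binary order.

    Bit value 1 stands for a black cell ('#'), 0 for a white cell ('.').
    """
    cells = columns * rows
    for number in range(2 ** cells):
        bits = format(number, "0{}b".format(cells))
        flat = bits.replace("1", "#").replace("0", ".")
        yield [flat[r * columns:(r + 1) * columns] for r in range(rows)]


def passes_test(grid, max_percent_blacks):
    """The 'test' part: here just one user condition, the share of blacks."""
    cells = sum(len(row) for row in grid)
    blacks = sum(row.count("#") for row in grid)
    return 100.0 * blacks / cells <= max_percent_blacks


def generate_and_test(columns, rows, max_percent_blacks):
    """Collect every generated grid that satisfies the condition."""
    return [
        grid
        for grid in generate(columns, rows)
        if passes_test(grid, max_percent_blacks)
    ]


# All 2x2 grids with at most 25% blacks, i.e. at most one black cell.
target = generate_and_test(2, 2, 25.0)
```

The number of candidates doubles with every added cell, which is why exhaustive generation is practical only for small sizes and why the other modes and algorithm numbers restrict the variants.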

With Auto Source combo box you can ask the program to make the selection of the Source for you. The steps of this operation are as follows: according to the selected Target dimensions the software will look for libraries having proper dimensions as building blocks. If nothing is found it will show a message like "The proper 8x4 not found" and will stop. If something is found, the generation will start.

Generation Filters Dialog

The 'Filters for generation' button will open a dialog for the filters. The filters are the conditions you have to supply for the 'test' part of the generation. The dialog is almost the same as the one for the Source selection filters, so only the differences are described here. In several check boxes the word "No" replaces "Has", and some check boxes are replaced with others supported only by the diagram generator. The Mark Used check box, when set, forces the generator to mark the used diagrams in the Source file. If the Cells check box is set, the generator switches to the mode for generating building blocks. In this mode some restrictions are relaxed (e.g. the minimum length of words on the boundaries and the Convex property are ignored) and the Target header flag Cells is set. Black Grid is added to the Style list; it is supported only for square diagrams in No Source mode.

Algorithm Options Dialog

The 'Generation algorithm options' button will start the dialog where you can choose the parameters for the 'generate' part of the generation.

Show Generation means to show a window displaying the generated diagrams. After finishing the generation, you have to close this window because the program will not do this.

Sort Patterns is used to obtain different diagrams in No Source algorithm mode. If it is not checked, the patterns are used in ascending binary order - this is the recommended variant to start with. When it is checked, you should define the desired sorting. If Descending is not checked, the order is ascending. Binary means that the patterns are treated as binary numbers, which define the order. Blacks means that the number of blacks in the pattern defines the order. Random means that random numbers assigned to the patterns define the order. When Random is checked, you can set the Seed value to -1 (the current time is used) or to some positive integer value. Practice shows that reordering the patterns can produce results faster. For example, sizes greater than 17 can be obtained in seconds using random sorting.

The radio buttons Algorithm 1 to Algorithm 4 define how the generator makes the combinations. The first one tries all possible combinations, and the higher the algorithm number, the fewer combinations are tried. Our advice is always to start with number 4 and then try the others.
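
The three orderings can be sketched as follows. The example assumes that a pattern is a string of '0' (white) and '1' (black) characters; this representation and the function name are hypothetical, and only the ordering rules are taken from the description above.

```python
import random
import time


def sort_patterns(patterns, order="binary", descending=False, seed=-1):
    """Order patterns by binary value, by number of blacks, or randomly."""
    if order == "binary":
        key = lambda p: int(p, 2)
    elif order == "blacks":
        key = lambda p: p.count("1")
    elif order == "random":
        # A seed of -1 means "use the current time"; any other value
        # gives a repeatable sequence.
        rng = random.Random(time.time() if seed == -1 else seed)
        numbers = {p: rng.random() for p in patterns}
        key = lambda p: numbers[p]
    else:
        raise ValueError("unknown order: " + order)
    return sorted(patterns, key=key, reverse=descending)


patterns = ["101", "010", "110", "001"]
by_binary = sort_patterns(patterns)
by_blacks = sort_patterns(patterns, order="blacks")
shuffled = sort_patterns(patterns, order="random", seed=7)
```

A fixed positive seed makes the random order reproducible, which is useful when a particular ordering happened to give good results and should be run again.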

If No Source algorithm mode has been selected, the meanings are as follows:
Algorithm 1: unconditionally all possible variants.
Algorithm 2 - 4: mainly variants of symmetric diagrams - only the top-left quarter is generated, and this part is used as a building block from which the final diagrams are obtained by transformations. You can obtain symmetric diagrams via Algorithm 1 as well, but here there are fewer variants and the generation is much faster. When the Black Grid style is selected in the filters, the program will slightly relax the requirements in order to obtain symmetric diagrams, but it will not do this for Algorithm 1.
The Scandy style is supported in this mode as well, but the required time is reasonable only for small sizes. You could use Catenae mode for this style with a proper Source. And a reminder: the program has no built-in knowledge of all impossible variants. For example, there are no symmetric square Scandy diagrams of even sizes, but the generation will still start and will end with a result of 0 after 5 seconds or after 5 hours.
When the Clues In style is selected, the program uses a time limit of 10 seconds for the 'hard' clue allocation algorithm in order to prevent blocking of the generation.

If Catenae mode is selected you can use Style Wrapper and Use StdSymm check boxes.

Style Wrapper if checked will force the generator to put the proper wrapper for the selected style from filters dialog. The wrapper for Scandy style is just the top row and the left column. The wrapper for NY Times style is of size columns x 3 on top and bottom, and of size 3 x rows on left and right. The generated wrapper variants participate as building blocks in the generation. The other styles have no wrappers.

Use StdSymm if checked will instruct the generator to use standard symmetry. In this mode the upper half only of the diagram will be generated and the lower half will be obtained by the rules of the symmetry.
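
To illustrate how a lower half can be derived from an upper half, the sketch below assumes that standard symmetry means 180-degree rotational symmetry, as is usual for crosswords; this document does not define the term, so treat that as an assumption. The function name and the grid representation ('#' black, '.' white) are likewise invented for the example.

```python
def complete_standard_symmetry(upper, rows):
    """Build a full grid from its upper part under 180-degree symmetry.

    `upper` holds the first ceil(rows / 2) rows as strings. For an odd
    number of rows the last of them is the middle row, whose right part
    is rebuilt from its left part so that the row maps onto itself.
    """
    half = rows // 2
    grid = list(upper[:half])
    if rows % 2 == 1:
        middle = upper[half]
        left = middle[:(len(middle) + 1) // 2]
        grid.append(left + left[:len(middle) // 2][::-1])
    # Row (rows - 1 - r) is row r read backwards.
    grid.extend(row[::-1] for row in reversed(upper[:half]))
    return grid


def has_standard_symmetry(grid):
    """True if the grid is unchanged by a 180-degree rotation."""
    return grid == [row[::-1] for row in reversed(grid)]


full = complete_standard_symmetry(["#....", ".....", "..#.."], rows=5)
```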

In Catenae and Transform modes you can choose some of the transformation operations below. They will be performed on the building blocks and this is a way to increase the possible variants for finding the desired diagrams.

The Flip operations include flipping on both diagonals and on the vertical and horizontal axes. The rotation operations include 90, 180, and 270 degrees clockwise. Shift Horizontal performs all possible shifts left and right with carry, and Shift Vertical performs all possible shifts down and up with carry. In the From Position fields you can define the starting position for the shifting operations.
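
The transformation operations can be sketched on a grid stored as a list of strings. The example reads "with carry" as wrap-around (cells pushed off one edge re-enter at the opposite edge), which is an assumption, and all function names are illustrative rather than taken from the program.

```python
def mirror_left_right(grid):
    """Flip about the vertical axis: reverse every row."""
    return [row[::-1] for row in grid]


def mirror_top_bottom(grid):
    """Flip about the horizontal axis: reverse the order of the rows."""
    return grid[::-1]


def flip_main_diagonal(grid):
    """Transpose: rows become columns."""
    return ["".join(col) for col in zip(*grid)]


def rotate_180(grid):
    return [row[::-1] for row in grid[::-1]]


def flip_anti_diagonal(grid):
    """Flip about the other diagonal: a 180-degree turn, then a transpose."""
    return flip_main_diagonal(rotate_180(grid))


def rotate_90_clockwise(grid):
    return ["".join(col) for col in zip(*grid[::-1])]


def shift_horizontal(grid, n):
    """Shift every row right by n cells with wrap-around."""
    n %= len(grid[0])
    return [row[-n:] + row[:-n] for row in grid]


def shift_vertical(grid, n):
    """Shift the rows down by n positions with wrap-around."""
    n %= len(grid)
    return grid[-n:] + grid[:-n]


block = ["abc", "def"]
rotated = rotate_90_clockwise(block)
shifted = shift_horizontal(block, 1)
```

Letters are used instead of blacks and whites so that the effect of each operation is easy to follow. Note that rotating a non-square block by 90 degrees swaps its dimensions.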


5.3 Words Mode

When in Browse tab the CPT Mode is Words, you are working with the CPT Words generator, and you will see the following:

Dimensions fields show the current size (columns by rows).
The Target properties (dimensions, diagram style, ...) are predefined by the current Source. The locale and encoding are predefined by the Base Words.

The Result field shows the number of items in the current set.

'Generation algorithm options' button will start the Algorithm Options dialog.

Exclude Words if checked, will force the generator to skip the words from the given word list when reading the Base Words. Via 'Set Exclusion Word List File' button you should set the file path. 'Text Word List Encoding' button will start the Word List Encoding dialog where you can set the encoding and locale of the file.

Algorithm Modes

Default - the program will choose the proper mode according to the diagram properties.

Cells - the default mode for diagrams having no blacks between words. It is designed mainly for so-called double word squares. At each step of the generation the program selects just one letter and puts it into a cell.

Words - at each step the program selects a word and puts it into a word slot.

Mixed is a combination of the previous two.

Mixed+ is an enhancement of the previous one - the program builds additional tables and uses more memory. This is the default mode for most of the diagrams.

Unicode is supported by Words and Mixed+ modes. If you have selected Default or unsupported mode, the program will choose Mixed+.

Generally, the Mixed+ mode is the fastest one, but for some particular cases this is not true. Mixed mode could be faster for simple diagrams, Words mode could be faster for Unicode, and Cells mode is the fastest for double word squares.
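
To give a feel for word-at-a-time filling with backtracking, here is a toy sketch that fills a square grid without blacks row by row and rejects a candidate as soon as some column can no longer be completed to a word. It is not the generator's algorithm; the function names and the tiny word list are invented for the example, and none of the heuristics or options described in this document are modeled.

```python
def fill_square(words, size):
    """Return a list of row words whose columns are also words, or None."""
    candidates = [w for w in words if len(w) == size]
    # Every prefix of every candidate, used to reject dead ends early.
    prefixes = {w[:i] for w in candidates for i in range(size + 1)}

    def extend(rows):
        if len(rows) == size:
            return rows
        for word in candidates:
            if word in rows:
                continue
            trial = rows + [word]
            columns = ["".join(col) for col in zip(*trial)]
            if all(col in prefixes for col in columns):
                result = extend(trial)
                if result is not None:
                    return result
        # No candidate fits this slot: the caller moves on to its next
        # candidate, which is the backtracking step.
        return None

    return extend([])


solution = fill_square(["ten", "cat", "are"], 3)
no_solution = fill_square(["cat", "dog", "pig"], 3)
```

A result of `None` corresponds loosely to the situation reported by the program as 'Out of words': the given list does not allow the grid to be completed.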

During the generation, in the place of Search, Save, and Delete buttons two other buttons will appear: 'Pause' is used to pause the generation, and 'Stop' is used to cancel the generator. In pause mode the 'Start' button is used to continue the paused process.

If you have checked Show Generation in Algorithm Options dialog, a window showing current status of the grid will appear.

This is a sample of the final form of the window of Mixed+ mode generation with Source containing 10 different diagrams and the Max Variants per source diagram is set to 5 (all settings are as in the picture below). The title shows that the last grid was generated in 2 seconds, and the total time for all of the 50 grids is about 1 minute. The status bar contains the properties of the current diagram and the actual number of the words used during the generation of the current grid (here the Base Words is Slovenian word list with over one million words).

When the generation process is finished, the number of generated items will be set in the Result field. If the generation fails, an error message will be shown. The message 'Out of words' means that the generator is not able to find a solution using the current Base Words and the current algorithm options.

Algorithm Options Dialog

Via the radio buttons Algorithm 1 to Algorithm 4 you select one of the built-in algorithms of the generator. They differ in the way the search list is built and in the backtracking process. Note that these algorithm numbers are actually modifications of the selected algorithm mode. Only Algorithm 3 is 'sound' - it will find a solution if at least one exists, but in most cases it is the slowest one. If you want to find all possible solutions, use Algorithm 3, Max Variants = a big number, and Max Common Words = -2 (see below). The other algorithms use 'unsound' techniques and heuristics in order to find a solution quickly. All options marked as 'unsound' are ignored when Algorithm 3 is selected.

Keep Source means that if there are preset letters in the grid, the generator should not delete these letters during the backtracking. Usually, you should uncheck it when the Source is unfinished result from a previous generation, otherwise, you will see immediately the message 'Out of words' (if you have not increased the size of Base Words).

Local Backtrack is one of the 'unsound' techniques. The words chosen using this technique will be drawn in blue color.

Max Backtracks is another 'unsound' parameter. The value of -1 is the default - the generator will choose some small number according to the algorithm mode. For algorithm mode Words the default value is 5 and for the other modes it is 100. The value 0 is a special case - the generator will make a jump backtrack to the starting word/cell when a multiple repeated backtrack is detected. Any other value defines the maximum number of repeated backtracks the generator should allow. If you set a big number (e.g. more than the number of Base Words), this parameter becomes 'sound'.

Sort Base Words means to order the lists of word/letter candidates in particular order. If Descending is checked, the sorting order is descending, otherwise, it is ascending. Letter Frequency means that the order is defined by the normalized letter frequency of the word or by the letter frequency in the dictionary. Note that, if you want the words having bigger letter frequency to be in the front, Descending should be checked as well. Binary denotes binary sorting of the words/letters. Random means that a random number generator will be used to assign random number to any word/letter and the sorting will be according to that number. Seed is the initial value for the random generator. The value -1 means "take it from the computer clock". You can set other fixed value in order to get a fixed sequence from the random generator.
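
One possible reading of the Letter Frequency ordering is sketched below: letter frequencies are counted over the whole word list, and each word is scored by the mean frequency of its letters. The exact normalization used by the program is not stated here, so the mean is an assumption, as are the function names and the sample words.

```python
from collections import Counter


def letter_frequencies(words):
    """Relative frequency of each letter over the whole word list."""
    counts = Counter(ch for word in words for ch in word)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def word_score(word, freq):
    """Mean letter frequency of the word (an assumed normalization)."""
    return sum(freq[ch] for ch in word) / len(word)


def sort_base_words(words, descending=False):
    """Sort words by their letter-frequency score."""
    freq = letter_frequencies(words)
    return sorted(words, key=lambda w: word_score(w, freq), reverse=descending)


words = ["eel", "zip", "lie"]
common_first = sort_base_words(words, descending=True)
```

As noted above, putting the words with the most frequent letters first requires the descending order.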

Max Variants defines the number of target grids, which will be generated per source diagram. If this value is greater than 1, after finishing the current generation, the result is saved, the generator backtracks, and starts another variant using the same source.

Max Common Words defines the maximum number of words the variants may have in common. It is ignored if Max Variants is set to 1. When the value is negative (-1 or -2), the generator will not delete any words from the Base Words and will generate the next variants using all available words. If the value is -1, it will backtrack on the starting word/cell. If the value is -2, it will backtrack on the last word/cell (this is used to find all possible variants with a 'sound' search). Any other value defines the maximum number of common words the consecutive variants may contain, and the backtracking is on the starting word. For example, the value of 0 means that all words used in the previous variants will be deleted from the Base Words. Note that this parameter affects only the variants obtained from one source diagram.

The Start Position group defines the starting word in the search list. By Algorithm leaves this task to the generator. Random forces the generator to choose randomly the starting word. Choose is the user choice - in the text fields you should set the column and row coordinates of the starting word. Across means that the coordinates are for across word, otherwise - down word.

Save Unfinished will force the generator to save any current grid when it is stopped for some reason.

Show Generation means to show a window displaying the status of the current grid. Nonstop if not checked, will pause the generator on any solution found, and you should click on 'Start' to continue. Draw 3D defines how the grid is drawn. If it is not checked the diagram will be drawn in black and white. Words Numbers will show the word numbers in the grid. Grid Numbers will show the column and row coordinates. Upper Case will force the conversion to upper case of the shown letters. Delay will stop in any step the generation for a small amount of time. Steps to Refresh defines the frequency of refreshing the status window.

If the generator is paused and you open this dialog, you can change some of the options; the new values take effect when you continue the generation. The options that cannot be changed in pause mode are disabled. One more option is shown in pause mode: Cancel Current means "cancel the generation for the current source grid and continue with the next source grid".

 

5.4 Clues Mode

When in Browse tab the CPT Mode is Clues, you are working with the CPT Clues generator and you will see the following:

Dimensions fields show the current size (columns by rows).

The Result field shows the number of items in the current set.

'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of clues from the Base Clues.

'Generation algorithm options' button will start the Algorithm Options dialog.

More Dictionaries allows including additional CTree dictionaries in the search process. Use 'Add New' button to set the path of the file. 'Remove Selected' button will delete the selected entry. Note that the searching in additional dictionaries is quite slow, because they are scanned sequentially and any word is converted to 'crossword form' before the match test.

During the generation you can use the 'Stop' button to cancel the generator.

If you have selected Interactive mode in the Algorithm Options dialog, the Target Crossword dialog showing the selected data will appear. When the generation process is finished, the number of generated items is shown in the Result field. If the generation fails, an error message is shown; if only some clues/answers were not found, a warning message is shown instead.

Algorithm Options Dialog

Mode Tab

Interactive Mode will show the Target Crossword dialog, where you can browse/edit all of the generated data.

Show Tags will add information for the tags in front of the clue/answer text. This information is not included in the final text.

Include Answers and Include Title Data tell the generator to include these items in the crossword as well. The answers are optional presentations of the words that can be shown in the printout. If the Base Clues or the additional dictionaries contain a clue with tag 'xa', it is taken as the answer; otherwise the program converts the first letter of the word to upper case and uses the result as the answer. Take care with acronyms and multiple-word names, because for them this default is obviously not correct.
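A minimal Python sketch of the default answer rule described above shows why acronyms and multiple-word names need an explicit 'xa' entry. The function name is hypothetical and this is not the program's own code.

```python
def default_answer(word: str) -> str:
    """Convert only the first letter to upper case, as the default rule does."""
    return word[:1].upper() + word[1:]


# An ordinary word comes out as expected.
ordinary = default_answer("accelerator")
# An acronym and a two-word name stored in lower case do not.
acronym = default_answer("nato")
name = default_answer("new york")
```

Here ordinary is "Accelerator", while acronym becomes "Nato" and name becomes "New york", which is why such words should carry their own answer in the dictionary.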

Reject Source Data means that any clues, answers or title data from the source will be ignored. If not checked, the data from the source will be selected.

Unicode Target means to convert the crossword data to Unicode. You will need to check this, if the encoding of the source grid and the encoding of the clue dictionaries are not compatible.

The Clue Selection group shows the simple strategies used to select the clues. Use Filters means that any clue not matching the filters will be ignored. First Found will stop the search process on first found clue. Shortest will select the shortest clue from the list of clue candidates. If it is not checked, the first found will be the initial selection.

The selection itself is done in the following order: 1) the source data; 2) the Base Clues dictionary; 3) all additional dictionaries. The first letter of any clue/answer taken from the additional dictionaries is converted to upper case.
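The selection order and the Clue Selection flags can be sketched as follows. This is an illustration in Python under stated assumptions, not the program's algorithm: the function and parameter names are hypothetical, clues are plain strings, and the way First Found and Shortest interact (First Found wins) is a guess.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence


def select_clue(source: Iterable[str],
                base: Iterable[str],
                additional: Sequence[Iterable[str]],
                matches: Optional[Callable[[str], bool]] = None,
                first_found: bool = False,
                shortest: bool = False) -> Optional[str]:
    """Pick one clue for a word from the three kinds of clue sources."""

    def in_order() -> Iterator[str]:
        # 1) the source data, 2) the Base Clues, 3) the additional dictionaries.
        yield from source
        yield from base
        for dictionary in additional:
            for clue in dictionary:
                # Clues from additional dictionaries get an upper-case first letter.
                yield clue[:1].upper() + clue[1:]

    candidates = []
    for clue in in_order():
        if matches is not None and not matches(clue):
            continue  # Use Filters: clues not matching the filters are ignored.
        if first_found:
            return clue  # First Found: stop on the first acceptable clue.
        candidates.append(clue)
    if not candidates:
        return None
    # Shortest: take the shortest candidate; otherwise keep the first one found.
    return min(candidates, key=len) if shortest else candidates[0]
```

With base clues "Device for controlling speed" and "Gas pedal" plus an additional dictionary holding "car part", the Shortest setting would return "Car part".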

Title Data Tab

In the text fields Title, Author, and Copyright you can enter the default data to be included in any of the generated crosswords.

Target Crossword Dialog

If you click on 'Dismiss' button, the generation will continue in automatic mode. While you are in interactive mode, for any crossword you have to confirm the data using 'OK, save' button, and then the program will continue with the next one. To cancel the generator, use the 'Stop' button from the main window.

The words from the grid are shown on the left and the available data for the selected word will be shown on the right. When you change the selected word, the data on the right will be changed as well.

To change the initially selected clue, select the new clue from the list and click on Fix Clue. If you want to change the text of a clue, edit it in the text clue field and click on Fix Clue. The program will take the data from the text field only when the state of Fix Clue is changed from unchecked to checked.

Proceed the same way with the answer.

 

6. How To

Here you will find step by step procedures for the most important tasks.

One general remark: if you start a generation and immediately see a 'Stopped' message, the selected parameters are wrong or not supported.

 

6.1 Generate Diagrams

Ensure that in Browse tab the CPT Mode is Diagrams.

Quick Start

No Source, size 15x15, NY Times style:

No Source, size 15x15, NY Times and Clues In style:

Use the same procedure as above but in filters select Clues In in Style list, set Check Style, and set 10 in Max Items (here we impose a hard requirement).

No Source, size 15x15, Black Grid style:

Use the same procedure as for the first sample but in filters set 35 in Max % Blacks, -1 in Max % Unches, 5 in Max Black Pattern, select Black Grid in Style, set Check Style and clear all No White flags.

More Examples

The best way to understand how the generator works is to start with simple examples.

'No Source' , size 5x5:

You will see the generated diagrams in a window which will appear on the right. These samples are probably not what we would like. So, close the display window, run the filters dialog, set Standard Symmetry, put 2 in Max Black Pattern, enter 25 in Max % Blacks and run the generation again. The result should be noticeably better. Play with the filter settings to see the effect on the generation. Do not delete the Target - we will use it in the next example.

'Catenae', size 10x10:

You will see the catenation process in action. Now you can play with Flip, Rotate and the algorithm number. For Algorithm 1 the transformation operations will not increase the number of combinations but will slow down the search process. For some combinations of the flags the generator will not find any result and will briefly show a "Stopped" message.

You can select the result in the Source and repeat this procedure with bigger size (20x20).

'Catenae', size 15x15, NY Times style:

Without proper Source the result will be nothing or some small number of diagrams. To improve the output, we will use style wrapper - in Algorithm Options dialog set Style Wrapper. This could add some more diagrams to the result. To get better results, we have to select/generate carefully our building blocks. Fortunately, the library folder contains the proper sets. So, we choose All Libraries in Auto Source and run the generation again. After several seconds the new diagrams will begin to appear. It is a good idea before any long generation to check carefully the filters, e.g. set Max Word Length to 12, Max % Blacks to 20, No 2x2 Black and No 3 Black Corner should be checked. The essential properties for the style are always on and checked by the program in this mode.

Here is a table for some preferred input sizes for NY Times style with wrapper and standard symmetry used for generation:

Grid Size   Input Size   Grid Size   Input Size   Grid Size   Input Size
9 x 9       3 x 2        17 x 17     11 x 6       25 x 25     19 x 10
11 x 11     5 x 3        19 x 19     13 x 7       27 x 27     21 x 11
13 x 13     7 x 4        21 x 21     15 x 8       29 x 29     23 x 12
15 x 15     9 x 5        23 x 23     17 x 9       31 x 31     25 x 13
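All twelve entries of the table above happen to follow one pattern: for a grid of N x N the input is (N - 6) columns by (N - 5) / 2 rows. The Python sketch below encodes that observation only; it is not a rule stated by the program, the function name is hypothetical, and sizes outside the table (odd 9 to 31) are rejected rather than extrapolated.

```python
from typing import Tuple


def preferred_input_size(grid_size: int) -> Tuple[int, int]:
    """Return (columns, rows) of the input size listed for an N x N grid."""
    if grid_size % 2 == 0 or not 9 <= grid_size <= 31:
        # The table covers odd sizes from 9 to 31 only.
        raise ValueError("the table lists odd grid sizes from 9 to 31")
    columns = grid_size - 6
    rows = (columns + 1) // 2
    return columns, rows
```

For the common 15 x 15 grid this gives an input of 9 x 5.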

Catenae, size 17x17, Scandy style:

You can manually select the same cells (d16x16cells.dlz file) into the Source, set None in Auto Source and try other Target dimensions like 13x13, 15x15, and 15x21.

 

6.2 Generate Crosswords

Step 1. Select the Source for CPT Words generator

The Source type could be empty diagram or partially filled grid. The selection is usually made from Files folder (as shown in the sample below) or from Library folder. The number of the diagrams in the Source set is not limited.

In Browse tab select Folder Type 'Files', CPT Mode 'Words' and click on 'Start' button.

In CPT Editor's Folder Browser window select the file '77_a.ini', click on 'Select' button, click on 'OK, save' button, and click on 'Dismiss' button. Note that the old Source should be deleted in order to do the proper select operation.

Step 2. Run CPT Words generator

Select Target tab in the main window and click on 'Start' button (Run should be checked, and the proper options should be set in Algorithm Options dialog).

After the end of the generation close the grid status window (if shown), and optionally, you can save the new grids in a library file.

Step 3. Select the Source for CPT Clues generator

The Source type could be filled grid or a crossword with clues. In the sample below, the selection is made directly from the last generated grids. In general, you can use CPT Editor for the selection (as described in Step 1).

In Source tab choose Target in Select From, check the Select check box and click on 'Start' button.

Step 4. Run CPT Clues generator

In Browse tab switch the mode to Clues.

In Target tab check the Run check box and click on 'Start' button (the proper options should be set in Algorithm Options dialog).

After finishing the generation of clues, you can browse/print the new generated crosswords in Editor (click on 'Start' button with Show checked and Run unchecked). Note that any new generation will delete the old Target, and if you want to keep these crosswords, save them in a library file.

Fight Against The Exponential Complexity

The CPT Words generator is able to compile tens of grids in seconds, but this is not the case with more complex diagrams and huge word lists, where years of CPU time might be necessary to search the space of the possible variants.

Before following the list of hints, keep these notes in mind:
When you have a small list of words as Base Words or you intend to make a long run, it is preferable to use the 'sound' approach - choose Algorithm 3 or set Max Backtracks to a big number and Local Backtrack off. You can see the difference using the file 'en134.wlb' as Base Words. This is a test word list of 134 words, which should give exactly 48 variants for a diagram of size 5x5 with no blacks. All algorithms in 'sound' mode with Max Common Words = -2 will find all variants, while in 'unsound' mode not all variants will be found.
A small Base Words list can result in an 'Out of words' message; enlarging it gives the program a better chance but increases the complexity as well.
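To make the idea of a 'sound' search concrete, here is a minimal exhaustive backtracking sketch in Python for a square grid with no blacks. It is not the CPT Words algorithm. The function name is hypothetical, and two points are assumptions: every word may be used only once per grid, and each distinct filling counts as one variant.

```python
from typing import List, Sequence


def count_fills(words: Sequence[str], size: int) -> int:
    """Count all fillings of a size x size grid whose rows and columns are words."""
    pool = sorted({w for w in words if len(w) == size})
    # Every prefix of every word, including the empty one and the full word.
    prefixes = {w[:k] for w in pool for k in range(size + 1)}
    rows: List[str] = []

    def search() -> int:
        columns = ["".join(r[c] for r in rows) for c in range(size)]
        # Prune as soon as some column cannot be completed to a word.
        if any(col not in prefixes for col in columns):
            return 0
        if len(rows) == size:
            used = rows + columns
            # Assumption: a word may appear only once in the grid.
            return 1 if len(set(used)) == len(used) else 0
        total = 0
        for word in pool:
            if word not in rows:
                rows.append(word)
                total += search()
                rows.pop()  # backtrack and try the next word
        return total

    return search()
```

Because the search never gives up on a branch early, it visits every filling, which is what makes it 'sound'; the price is a running time that grows very quickly with the grid size and the word list.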

Hint 1. Start with fast tests

Choose algorithm mode Default and Algorithm 1 or 2, Max Backtracks = -1 (or other small value), Local Backtrack on, sort Base Words with Descending and Cross Counters or Letter Frequency on. If there is no solution in several seconds, stop it, choose another algorithm or increase Max Backtracks and start again.

This is the default mode, which can give results in short time even for complex diagrams and big Base Words.

Hint 2. Reorder Base Words

Choose Sort Base Words with Random on and Seed = -1, start the generator and if there is no solution (or reasonable progress) in 10 seconds, stop it, and start again.

It is funny that we can fight against the complexity using a random number generator, but experience shows that this often gives results. Many complex diagrams were solved this way (using hint 1 with random sorting, or Algorithm 1 in 'sound' mode - Max Backtracks is a big number, and Local Backtrack is off).
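The manual procedure of Hint 2 amounts to random restarts: shuffle the word order, give the search a short time budget, and start over if it does not finish. The Python sketch below shows the loop only. The names are hypothetical, and try_fill stands for any fill routine that takes a word order and a time limit in seconds and returns a grid or None; it is not an interface of the program.

```python
import random
from typing import Callable, List, Optional, Sequence


def solve_with_restarts(words: Sequence[str],
                        try_fill: Callable[[List[str], float], Optional[object]],
                        attempts: int = 5,
                        seconds_per_attempt: float = 10.0,
                        seed: Optional[int] = None) -> Optional[object]:
    """Retry a fill routine with a freshly shuffled word order each time."""
    rng = random.Random(seed)  # seed=None gives a different order on every run
    for _ in range(attempts):
        order = list(words)
        rng.shuffle(order)
        result = try_fill(order, seconds_per_attempt)
        if result is not None:
            return result  # some attempt found a solution within its budget
    return None  # no attempt succeeded
```

The design choice is to spend many short attempts rather than one long one, on the expectation that some word orders lead to a solution much faster than others.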

Hint 3. Help the generator in run time

If the program is not able to get out of a 'cycle' for a long time, pause it, set Algorithm 1, Max Backtracks to 0 or 2 and run it again. When the program has overcome the 'hard place' in the grid, pause it, set Max Backtracks to a big number or switch to Algorithm 3 - this will help it to keep the words found.

Hint 4. Help the generator with the grid

Fill in by hand the longest words and some areas in the grid, choose the starting word.

This might sound like "don't use the program at all", but actually, the high quality crosswords are created this way - you have to choose the interesting theme words by yourself and leave the details to the compiler (this is true for the clues as well).

 

6.3 Make New Base Words

Note: this procedure will destroy your current Base Words (see the notes after Step 3).

Ensure that in Browse tab the CPT Mode is Words, and return to Base tab.

Step 1. Set encoding and locale of Base Words

Click on 'Set New Base Words Encoding' button and set the encoding and the locale in the dialog window.

Step 2. Select input for Base Words

The input could be from a CTree dictionary (a good candidate is your Base Clues dictionary, if it contains enough words) or from a text word list. The format of the text word list is just one word per line. You can use one of the spell checking lists available on the Internet, or create your own.

Check one of the 'Select ...' radio buttons, set the file path and the filters for CTree or the encoding for the text word list.
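If you create the text word list yourself, a few lines of Python are enough to bring raw words into the one-word-per-line form mentioned in Step 2. The function name is hypothetical; converting to lower case follows the default letter case described in Appendix B, and dropping duplicates is a convenience assumed here, not a stated requirement.

```python
from typing import Iterable


def to_word_list(raw_words: Iterable[str]) -> str:
    """Return the words as text with one lower-case word per line."""
    seen = set()
    lines = []
    for word in raw_words:
        word = word.strip().lower()
        if word and word not in seen:  # skip empty entries and duplicates
            seen.add(word)
            lines.append(word)
    return "\n".join(lines) + "\n"
```

The returned text can then be written to a file in the encoding you declare for the text word list.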

Step 3. Start the creation

If you have set some filters, Use Filters should be checked. If you want to append the input to the current Base Words, Append should be checked.

Check the Select check box, and click on 'Start' button.

When the creation finishes, close the Messages window. It is good practice to save the new Base Words in the Base Words folder after any creation (click on 'Save' button).

 

6.4 Make New Base Clues

Note: this procedure will destroy your current Base Clues (see the notes after Step 3).

Ensure that in Browse tab the CPT Mode is Clues, and return to Base tab.

Step 1. Set encoding and locale of Base Clues

Click on 'Set New Base Clues Encoding' button and set the encoding and the locale in the dialog window.

Step 2. Select input for Base Clues

The input could be from a CTree dictionary (not recommended, if you want to maintain the tags) or from a text file in 'Text Dictionary' format. The text format is described in the appendix. It is quite a tedious task to create this file, but the alternative is for any new crossword to type the same or similar clues again and again.
The creation process requires a CPT Tags file as well. This file is expected to be in the 'locale' directory and to have the name 'your_locale.tag', where 'your_locale' is the ISO locale code. If there is no such file, the 'default.tag' file will be taken.

Check one of the 'Select ...' radio buttons, set the file path and the filters for the CTree or the file path and the encoding for the text file.
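The lookup of the Tags file described in Step 2 can be pictured as a simple fallback. This Python sketch is an illustration, not the program's code: the function name is hypothetical, and it assumes that 'default.tag' sits in the same 'locale' directory as the locale-specific files.

```python
from pathlib import Path


def resolve_tags_file(locale_dir: str, locale_code: str) -> Path:
    """Prefer '<locale_code>.tag'; fall back to 'default.tag'."""
    specific = Path(locale_dir) / f"{locale_code}.tag"
    if specific.is_file():
        return specific
    # Assumption: the default file lives in the same directory.
    return Path(locale_dir) / "default.tag"
```

So for a locale without its own file, the tags from 'default.tag' would be used for the new Base Clues.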

Step 3. Start the creation

If you have set some filters, Use Filters should be checked. If you want to append the input to the current Base Clues, Append should be checked.

Check the Select check box, and click on 'Start' button.

When the creation finishes, close the Messages window. It is good practice to save the new Base Clues in the Base Clues folder after any creation (click on 'Save' button).

 

Appendix A: File Formats

A.1 Text Dictionary

The text dictionary format has the following strict field order (even for RTL texts stored in visual order):
word | morpho-tags | user-tags | topic-tags | clue-tags clue
For the creation of a CTree with clues you can have more than one line per word (all these lines should start with the same word). Here is an example of a word entry for Base Clues:
accelerator|0|0|0|xc Device for controlling speed
accelerator|0|0|0|xa Accelerator
In this sample the answer (the second 'clue', having the 'xa' tag) is actually not needed, because it would be obtained by the default rules of the program. But if the word is an all-uppercase acronym or a multiple-word name, the answer should be given. If there are several clues and answers per word, each answer should follow the corresponding clue.
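When preparing such a file with a script, it helps to check that every line splits into the expected fields. The Python sketch below parses one line of the format shown above. The names are hypothetical, and it assumes a single clue tag separated from the clue text by the first space, as in the two sample lines; it is not the program's own reader.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    word: str
    morpho_tags: str
    user_tags: str
    topic_tags: str
    clue_tag: str
    text: str


def parse_line(line: str) -> Entry:
    """Split 'word|morpho|user|topic|tag clue' into its six parts."""
    # Split on the first four bars only, so a '|' inside the clue text survives.
    word, morpho, user, topic, rest = line.rstrip("\n").split("|", 4)
    clue_tag, _, text = rest.strip().partition(" ")
    return Entry(word.strip(), morpho.strip(), user.strip(),
                 topic.strip(), clue_tag, text.strip())
```

A line with fewer than four bars raises ValueError, which is a convenient way to spot malformed entries before the creation is started.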

A.2 Tags

The format is quite complex and if you are really curious, check the documentation of CPT Word Lists. In 'locale' directory you can find some examples like 'enSample.tag'. Here is the sample of the minimal Tags file you will need ('default.tag' - optional tags are commented out).
Cp1252
# Morphology Tags
#<morpho
# ...
#>

# User Tags
#<user
# ...
#>

# Topics Tags
#<topic
# ...
#>

# Clues Tags
<clue
0  # 0 code unused
xc  clue
xa  answer
>

# END of all tags

Appendix B: Language Support

Letter Case

The default letter case of the words is lower. The dictionaries are created in lower case only, while the CPT Editor supports upper case as well. The special casing (Greek, German, ...), which maps one lower case character to 2 or 3 upper case characters, is supported in the display. For example, you can use the small German letter es-zed in the crossword grid, and if Upper Case is checked, 'SS' will be shown in the letter cell. The reverse is not true and that's why the default letter case is lower.
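The one-way nature of special casing is easy to reproduce outside the program. The following Python lines use the standard Unicode case mappings (the same idea as the display conversion described above, although the program's own code is not shown here).

```python
word = "straße"          # lower case, with the German es-zed

shown = word.upper()     # what an upper-case display would show
restored = shown.lower() # converting the displayed form back
```

The upper-case form is "STRASSE", but converting it back gives "strasse", not the original word, which illustrates why lower case is the safer form for storing the words.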

RTL Scripts

In the dictionaries and in the crosswords the RTL text should be in logical order.
The CPT Editor and the generators (Words, Clues) support 'right-to-left diagrams' as well. You can convert a diagram to RTL in Additional Properties dialog in CPT Editor via checking RTL Numbers. Note that these diagrams have data format 'Grids+'. CPT Diagrams generator does not produce directly RTL diagrams but you can save the Target as new RTL library via checking RTL Numbers in New Library dialog.

The text field controls support bidi processing without jumping selection for all Java versions when the proper RTL check box is set. You can find more details in the "Language support" appendix in the documentation of CPT Word Lists.
We have to note that the dialog windows are LTR oriented even when they contain bidi enabled controls, and the meaning of the keyboard keys is always LTR as well.

Encoding

If the encoding you are using is not in the display list, use the User Encoding dialog in CPT Editor to try to include it as a user defined 8-bit converter. For example, the encoding ISO8859-13 (used by the Baltic languages) is not in the list but it is supported by recent versions of Sun's Java runtime (although I would suggest recoding your source data to another encoding). If there is a 'single character per letter' problem, it could be solved using custom converters (like the VN1 converter for Vietnamese).
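Recoding the source data before loading it can be done with any tool that knows both encodings. As one possibility, here is a minimal Python sketch; the function name is hypothetical and the choice of UTF-8 as the target is an assumption, so pick whichever supported encoding suits your data.

```python
def recode(data: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Decode bytes from one encoding and encode them in another."""
    return data.decode(source_encoding).encode(target_encoding)


# Example: a Latvian word stored as ISO8859-13, recoded to UTF-8.
baltic_bytes = "Rīga".encode("iso8859_13")
utf8_bytes = recode(baltic_bytes, "iso8859_13")
```

Decoding raises UnicodeDecodeError if the data is not valid in the declared source encoding, so a wrong guess can be noticed early.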

Thai

The program code contains a surprise not documented elsewhere. When creating new Base Words in the Thai language, if you select Unicode as the encoding, the custom Thai Unicode normalization will be switched on. The default mode is 'Single Cells' (you can find more details in the documentation of CPT Word Lists). If a crossword is in Thai and in Unicode, the program will assume the custom normalization. For this reason the data recoding to/from Unicode is disabled.


© 2002-2004 CPT Software. All rights reserved.