dcn ver 0.3

by Jeremy O’Brien

University of Chicago

 

based on the research of

John Goldsmith, University of Chicago

 and Gary Larson, Wheaton College

 

Documentation and How-To

 

The next version of dcn will be integrated into Linguistica. When that version of Linguistica is released, take a look at its documentation.

 

Introduction:

dcn is a simple program that simulates Dynamic Computational Networks, or DCNs. A DCN is a simple computational network that can be used to model many phenomena in human language. This application uses DCNs to model quantity-insensitive stress systems, but that is not the only possible use. For more information on DCNs, see John Goldsmith’s website. If you are interested, you should also visit Max Bane’s DCN website. Max has been doing some fantastic recent research on DCNs, in collaboration with me and Jason Riggle, among others.

dcn is written in C++, using the Qt GUI toolkit. It is open-source software. The program requires about 24 MB of free memory and about 100 MB of free disk space. A fast processor is helpful but not necessary.

 

Downloading the program:

     The application is available for Windows and Mac OS X. If necessary, you can use the source code and the free version of Qt to compile it for any platform that Qt supports.

 

How to use:

(Windows version: make sure the Qt DLL is in the same folder as the executable file.)

 

 

            Open the application. A dialog box like the one shown above will open, containing all of the program’s options.

 

Computing:

            The application takes parameters for a DCN and computes the corresponding stress system for those parameters.

            In the upper left-hand corner are the Parameters for the DCN. You can modify the values for alpha, beta, initial, and final. Then click Compute to find the corresponding stress system (which we can call a corpus). The corpus will appear on the right side, in the large text field labeled Corpus or Stress System. A 1 represents a stressed syllable, and a 0 represents an unstressed syllable. Words of syllable length 2 to 15 will appear in the text field. Depending on the network values, some or all of the words in the corpus may not converge, and the corpus display will mark such words with a not converged warning.
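            If you want to see what Compute is doing under the hood, the sketch below implements the network dynamics roughly as they are described in Goldsmith and Larson’s work: each syllable’s activation is repeatedly set to alpha times its right neighbor plus beta times its left neighbor, with initial added to the first syllable and final added to the last, and a syllable is stressed when its settled activation is a relative peak. This is only a sketch under those assumptions; dcn’s own code may differ in details such as the update order, the convergence test, or the exact stress criterion.

// Minimal sketch of the quantity-insensitive DCN computation.
// Update rule (assumed): x[i] <- alpha * x[i+1] + beta * x[i-1],
// with `initial` added to the first syllable and `final` to the last,
// iterated until the activations settle.  A syllable is marked stressed
// ("1") when its equilibrium activation is a relative peak.

#include <algorithm>
#include <cmath>
#include <iostream>
#include <string>
#include <vector>

// Returns the stress pattern for a word of n syllables, or "not converged"
// if the activations fail to settle within maxIterations.
std::string computeStress(int n, double alpha, double beta,
                          double initial, double final_,
                          int maxIterations = 1000, double tolerance = 1e-6)
{
    std::vector<double> x(n, 0.0), next(n, 0.0);

    for (int iter = 0; iter < maxIterations; ++iter) {
        double maxChange = 0.0;
        for (int i = 0; i < n; ++i) {
            double right = (i + 1 < n) ? x[i + 1] : 0.0;   // right neighbor
            double left  = (i > 0)     ? x[i - 1] : 0.0;   // left neighbor
            next[i] = alpha * right + beta * left;
            if (i == 0)     next[i] += initial;            // word-initial bias
            if (i == n - 1) next[i] += final_;             // word-final bias
            maxChange = std::max(maxChange, std::fabs(next[i] - x[i]));
        }
        x = next;
        if (maxChange < tolerance) {
            // Converged: stress the syllables whose activation is a local peak.
            std::string pattern;
            for (int i = 0; i < n; ++i) {
                double right = (i + 1 < n) ? x[i + 1] : -1e9;
                double left  = (i > 0)     ? x[i - 1] : -1e9;
                pattern += (x[i] > left && x[i] > right) ? '1' : '0';
            }
            return pattern;
        }
    }
    return "not converged";
}

int main()
{
    // The Weri parameters from the Examples section below.
    for (int n = 2; n <= 15; ++n)
        std::cout << computeStress(n, -0.7, -0.3, 0.3, 0.5) << "\n";
}

            Run with the Weri parameters from the Examples section, this sketch should print the same fourteen patterns that appear in that corpus, assuming it matches the program’s own dynamics.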

 

Learning:

            The application can also do the opposite of computing: given a corpus, it can learn parameters for a DCN. Note that, currently, the learning algorithm is only capable of learning negative values for alpha and beta.

DCN learning is not a one-to-one mapping, because a single corpus may correspond to millions of possible sets of values for the DCN. The learning algorithm will find one of these sets of values and display it in the ‘Parameters for the DCN’ field. To use the learning algorithm, fill out the corpus field with words made up of 1s and 0s. Make sure the last line in the corpus field is a blank line, or else the calculations will be off. It is also important that none of the words are contradictory. For example, if there are two 3-syllable words in the corpus, they must have the same stress pattern. If one of the words is, say, 001 and the other is 101, then the learning algorithm will not be able to find appropriate values for the DCN.

            The default Parameters for Learning Algorithm might be sufficient for you, but you will probably want to alter them. A discussion of what each value corresponds to is in the Discussion section of this how-to.

            Once the corpus is entered properly and the learning parameters are set the way you want them, click the Learn button. You might need to wait a while, depending on the speed of your computer. On a one- or two-year-old PC, learning may take up to 45 seconds; on older computers it might take longer, but the learning algorithm will eventually stop. If the algorithm is unsuccessful, the learning algorithm failed indicator will turn on. You might want to alter the parameters and try again, or your corpus may simply be unlearnable by this version of the learning algorithm.

            If the algorithm is successful, the learning algorithm successful indicator will turn on, and the values that it found will be placed in the Parameters for the DCN fields in the upper left. You can verify the result by clicking Compute; the corpus will then reflect the learned values of the DCN.

            A log file (DCNlog.txt) is created, containing information on the last run of the learning algorithm. It lists the relevant values at each iteration of the learning algorithm, so it can be quite large. This information gives a great deal of insight into how the algorithm behaves in particular situations.

 

Examples:

 

Weri:   Final plus leftward alternations

Corpus:

01

101

0101

10101

010101

1010101

01010101

101010101

0101010101

10101010101

010101010101

1010101010101

01010101010101

101010101010101

Parameters:      alpha -0.7,        beta -0.3,         initial 0.3,          final 0.5

 

Warao:  Penult plus leftward alternations

Corpus:

10

010

1010

01010

101010

0101010

10101010

010101010

1010101010

01010101010

101010101010

0101010101010

10101010101010

010101010101010

Parameters:      alpha -0.8,        beta -0.2,         initial -0.6,        final -0.7

 

Garawa:  Penult plus leftward alternations; Initial (trochee)

Corpus:

10

010

1010

01010

101010

1001010

10101010

100101010

1010101010

10010101010

101010101010

1001010101010

10101010101010

100101010101010

Parameters:      alpha -0.7,        beta -0.2,         initial 0.3,          final -0.5

 

Discussion:

            The learning algorithm is a simplified version of Gary Larson’s DCN learning algorithm, as explained in his 1992 dissertation. It uses the idea of simulated annealing, a metaphor from metallurgy that gives a search strategy for finding (with some luck) the global optimum. We use Temperature as a quantitative measure of how sure we are that we have the right values: the higher the temperature, the larger the changes we are likely to make to the values. As the temperature nears zero, the search cools off, and only tiny changes to the values are allowed.

            Below is the pseudocode for the learning algorithm, using the variables from the Parameters for Learning Algorithm fields shown above in the screenshot.

 

Repeating for # of Trials to Attempt

  Start with Starting Alpha, Starting Beta, Starting Initial and Starting Final

    Repeating for Max Steps in One Trial or until T is very small

      Take a word from the corpus

      Using the alpha, beta, initial, and final values, see if the network predicts this word

      If incorrectly predicted:

         Change alpha, beta, initial, and final each by

               (T * random number from -0.5 to 0.5)

         alpha and beta are not allowed to be positive

         the new values must converge for words of syllable length 2 to 15

               or else they aren’t changed (still under construction)

         T := T + Add When Wrong

      Otherwise, if correctly predicted:

         T := T * Multiply by When Right

 

            Because all of the above parameters appear in the program window, the user can fine-tune the algorithm without having to alter the source code.
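            To make the pseudocode concrete, here is a minimal C++ sketch of a single trial, reusing the computeStress() function from the earlier sketch (link the two together to run it). The parameter names are carried over from the Parameters for Learning Algorithm fields; how dcn itself initializes the temperature, cycles through the corpus, and decides that a trial has succeeded is not spelled out above, so those details are assumptions here, as is the choice to clamp alpha and beta rather than reject positive candidates.

// Minimal sketch of one annealing trial from the pseudocode above.
// Assumptions: the corpus is cycled in order, the starting temperature T is a
// parameter, alpha and beta are clamped to stay non-positive, and a trial
// succeeds when the final values predict every word in the corpus.

#include <algorithm>
#include <cstdlib>
#include <string>
#include <vector>

// From the earlier sketch of the Compute step.
std::string computeStress(int n, double alpha, double beta,
                          double initial, double final_,
                          int maxIterations = 1000, double tolerance = 1e-6);

struct DcnValues { double alpha, beta, initial, final_; };

// Uniform random number in [-0.5, 0.5].
static double jitter()
{
    return std::rand() / static_cast<double>(RAND_MAX) - 0.5;
}

bool learnOneTrial(const std::vector<std::string>& corpus,
                   DcnValues v,              // Starting Alpha/Beta/Initial/Final
                   double T,                 // starting temperature (assumed)
                   double addWhenWrong,      // Add When Wrong
                   double multiplyWhenRight, // Multiply by When Right
                   int maxSteps,             // Max Steps in One Trial
                   DcnValues& learned)
{
    for (int step = 0; step < maxSteps && T > 1e-9; ++step) {
        // Take a word from the corpus.
        const std::string& word = corpus[step % corpus.size()];
        std::string predicted = computeStress(static_cast<int>(word.size()),
                                              v.alpha, v.beta, v.initial, v.final_);
        if (predicted != word) {
            // Wrong: change each value by T * (random number from -0.5 to 0.5),
            // keeping alpha and beta non-positive, then heat up.
            v.alpha   = std::min(0.0, v.alpha + T * jitter());
            v.beta    = std::min(0.0, v.beta  + T * jitter());
            v.initial += T * jitter();
            v.final_  += T * jitter();
            // (dcn additionally rejects values that do not converge for words
            //  of 2 to 15 syllables; that check is omitted here.)
            T += addWhenWrong;
        } else {
            // Right: cool down.
            T *= multiplyWhenRight;
        }
    }
    // Accept the trial only if every word in the corpus is now predicted.
    for (const std::string& word : corpus)
        if (computeStress(static_cast<int>(word.size()),
                          v.alpha, v.beta, v.initial, v.final_) != word)
            return false;
    learned = v;
    return true;
}

            A full learner would wrap learnOneTrial() in an outer loop over # of Trials to Attempt, restarting from the Starting values each time, and would write the intermediate values at each iteration to DCNlog.txt.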

            In previous research, my fellow researchers and I found that DCNs exhibit rather unusual behavior when alpha or beta is greater than zero, especially when both are. For that reason, and to help limit the search space of the learning algorithm, for the time being we deal only with negative values for alpha and beta. This may change in future versions.

            Note that some values, while they might seem learnable at first glance, are in fact not learnable in actual execution. In particular, DCN values that depend on a parameter being exactly zero are not obtainable, because the algorithm semi-randomly walks around the search space. The value of alpha might come very close to zero (a magnitude of, say, 0.004), but it will never reach zero exactly, because for that to happen (T * random number) would have to equal exactly 0.004 (or its floating-point equivalent), and in practice that never happens.

 

Further Questions:

            Please feel free to contact me by email (address at the top of the page). I am open to comments and suggestions from anyone interested in DCNs and computational models of human language.