DCN Stress

by Jeremy O’Brien

University of Chicago

 

based on the research of

John Goldsmith, University of Chicago

 and Gary Larson, Wheaton College

 

Documentation and How-To


Introduction:

DCN Stress is a computational model of quantity-insensitive stress systems. The model is built on the idea of Dynamic Computational Networks, or DCNs. A DCN is a simple connectionist network that can be used to model linguistic phenomena, including stress and sonority/syllabification (see DCN Syllabification). Both DCN Stress and DCN Syllabification are built into Linguistica.

For more information on DCNs, see John Goldsmith’s website. If you are interested, you should also visit Max Bane’s DCN website. Max has been doing some fantastic recent research on DCNs, in collaboration with me and Jason Riggle, among others.

 

Downloading the program:

     DCN Stress is part of Linguistica. Download the Linguistica application for your operating system (Windows, Mac OS X, or Linux).

 

How to use:

(Windows version: make sure the Qt DLL is in the same folder as the executable file.)

            Open the application. In the main window, there will be a set of tabs, labeled Command Line, Graphic Display, DCN Stress, and DCN Syllabification. Go to the third tab, DCN Stress.

 

Computing:

            DCN Stress can take parameters for a DCN and compute the stress system that corresponds to those parameters.

            In the upper left-hand corner are the Parameters for the DCN. You can modify the values for alpha, beta, initial, and final. Then click Compute to find the corresponding stress system (which we can call a corpus). The corpus will appear on the right side, in the large text field labeled Corpus or Stress System. A 1 represents a stressed syllable, and a 0 represents an unstressed syllable. Words of syllable length 2 to 15 will appear in the text field. Depending on the network values, some or all of the words in the corpus may not converge, and the corpus display will mark those words with a “not converged” warning.
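
            For readers who want to experiment outside Linguistica, below is a minimal Python sketch of one common reading of the Goldsmith-Larson network dynamics. It is not the Linguistica source code, and the function name is my own. Each syllable is a node whose activation is repeatedly updated from its underlying bias and its neighbors' activations, a[i] <- bias[i] + alpha*a[i+1] + beta*a[i-1], where the first syllable's bias is initial, the last syllable's bias is final, and all other biases are zero. Once the activations settle, stress falls on every syllable whose activation exceeds that of its neighbors. This reading reproduces the Weri, Warao, and Garawa corpora given in the Examples section below.

# A sketch of the DCN dynamics described above (not the Linguistica source).
# Activations are updated until they stop changing; if they never settle
# within max_iters, non-convergence is reported by returning None.

def compute_stress(n, alpha, beta, initial, final, max_iters=1000, tol=1e-9):
    """Return the stress pattern (e.g. '0101') for an n-syllable word,
    or None if the network does not converge."""
    bias = [0.0] * n
    bias[0] += initial           # 'initial' biases the first syllable
    bias[-1] += final            # 'final' biases the last syllable
    act = bias[:]                # start from the underlying biases
    for _ in range(max_iters):
        new = [bias[i]
               + (alpha * act[i + 1] if i + 1 < n else 0.0)
               + (beta * act[i - 1] if i > 0 else 0.0)
               for i in range(n)]
        if max(abs(x - y) for x, y in zip(new, act)) < tol:
            # stress falls on local maxima of the settled activations
            return ''.join(
                '1' if ((i == 0 or new[i] > new[i - 1]) and
                        (i == n - 1 or new[i] > new[i + 1]))
                else '0'
                for i in range(n))
        act = new
    return None                  # did not converge

# For example, compute_stress(3, -0.7, -0.3, 0.3, 0.5) returns '101',
# matching the three-syllable word in the Weri example below.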

 

Learning:

            DCN Stress can also do the opposite of computing: given a corpus, it can learn parameters for a DCN. It should be noted that, currently, the learning algorithm is only capable of learning negative values for alpha and beta.

DCN learning is not a one-to-one mapping, because a single corpus may be generated by a great many different sets of values for the DCN. The learning algorithm will find one of these sets of values and display it in the ‘Parameters for the DCN’ field. To use the learning algorithm, fill out the corpus field with words made of 1s and 0s. Make sure the last line in the corpus field is a blank line, or else the calculations will be off. It is also important that none of the words are contradictory: if there are two 3-syllable words in the corpus, they must have the same stress pattern. If one of the words is, for example, 001, and the other is 101, then the learning algorithm will not be able to find appropriate values for the DCN. (A small script for checking this is sketched below.)
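
            Since contradictions are easy to overlook in a long corpus, here is a small, hypothetical Python helper (not part of Linguistica) that flags them before you click Learn:

# Flag contradictory words: two words with the same syllable count
# must carry identical stress patterns.

def find_contradictions(corpus_text):
    """Return a list of (first_word, conflicting_word) pairs."""
    seen = {}                        # syllable count -> first pattern seen
    conflicts = []
    for line in corpus_text.splitlines():
        word = line.strip()
        if not word:
            continue                 # ignore blank lines
        n = len(word)
        if n in seen:
            if seen[n] != word:
                conflicts.append((seen[n], word))
        else:
            seen[n] = word
    return conflicts

# find_contradictions("001\n101\n") returns [('001', '101')]:
# this corpus has two conflicting 3-syllable words and is not learnable.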

            The default Parameters for Learning Algorithm might be sufficient for you, but you will probably want to alter them. A discussion of what each value corresponds to is in the Discussion section of this how-to.

            Once the corpus is entered properly and the learning parameters are set how you want them, click the Learn button. You might need to wait a while, depending on the speed of your computer. On a one- or two-year-old PC, this may take up to 45 seconds. On older computers it might take longer, but the learning algorithm will eventually stop. If the algorithm is unsuccessful, a dialog box will open to tell you so. You might want to alter the parameters and try again, or it could be that your corpus is unlearnable by this version of the learning algorithm.

            If the algorithm is successful, a different dialog box will open, and the values it found will be placed in the Parameters for the DCN fields in the upper left. You can verify this by clicking Compute; the corpus will then reflect the values of the DCN.

            A log file (DCNlog.txt) is created in the same folder as Linguistica. This file contains information on the most recent run of the learning algorithm: it lists the relevant values at each iteration. This information gives a great deal of insight into how the algorithm behaves in particular situations. The file can also be very large, so an application that can open large files might be necessary (e.g. WordPad for Windows or TextEdit for Mac OS X).

 

Examples:

 

Weri:   Final plus leftward alternations

Corpus:

01

101

0101

10101

010101

1010101

01010101

101010101

0101010101

10101010101

010101010101

1010101010101

01010101010101

101010101010101

Parameters:  alpha -0.7,  beta -0.3,  initial 0.3,  final 0.5

 

Warao:  Penult plus leftward alternations

Corpus:

10

010

1010

01010

101010

0101010

10101010

010101010

1010101010

01010101010

101010101010

0101010101010

10101010101010

010101010101010

Parameters:  alpha -0.8,  beta -0.2,  initial -0.6,  final -0.7

 

Garawa:  Penult plus leftward alternations; Initial (trochee)

Corpus:

10

010

1010

01010

101010

1001010

10101010

100101010

1010101010

10010101010

101010101010

1001010101010

10101010101010

100101010101010

Parameters:  alpha -0.7,  beta -0.2,  initial 0.3,  final -0.5
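
Assuming the compute_stress sketch from the Computing section, these parameter sets can be checked mechanically; in my tests each set regenerates its corpus for word lengths 2 through 15:

# Regenerate each example corpus from its parameters, using the
# compute_stress sketch from the Computing section above.

examples = {
    'Weri':   (-0.7, -0.3,  0.3,  0.5),
    'Warao':  (-0.8, -0.2, -0.6, -0.7),
    'Garawa': (-0.7, -0.2,  0.3, -0.5),
}

for name, (alpha, beta, initial, final) in examples.items():
    print(name)
    for n in range(2, 16):
        print(compute_stress(n, alpha, beta, initial, final))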

 

Discussion:

            The learning algorithm is a simplified version of Gary Larson’s DCN learning algorithm, as explained in his 1992 dissertation. The algorithm uses simulated annealing, a search strategy (the metaphor comes from metallurgy) for finding, with some luck, the global optimum. We use the idea of Temperature as a quantitative measure of how confident we are that we have the right values: the higher the temperature, the less confident we are, and the larger the changes to the values we are likely to make. As the temperature nears zero, the search cools off, and only tiny changes to the values are allowed.

            Below is the pseudocode for the learning algorithm, using the variables from the Parameters for Learning Algorithm fields in the program window.

 

Repeating for # of Trials:
    Start with Starting Alpha, Starting Beta, Starting Initial, and Starting Final
    Repeating for Max Steps Per Trial, or until T is very small:
        Take a word from the corpus
        Using the alpha, beta, initial, and final values, see if the network predicts this word
        If incorrectly predicted:
            Change alpha, beta, initial, and final each by
                (T * random number from -0.5 to 0.5)
            (alpha and beta are not allowed to become positive, and the new
             values must converge for words of syllable length 2 to 15,
             or else they are not changed; this check is still under construction)
            T := T + Add When Wrong
        Otherwise, if correctly predicted:
            T := T * Multiply by When Correct
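
            For concreteness, here is a runnable Python version of that loop. It is only a sketch, not the Linguistica source: it reuses the compute_stress function from the Computing section, the parameter names mirror the Parameters for Learning Algorithm fields, and the starting temperature (initial_T), the order in which words are taken from the corpus, and the end-of-trial success check are my own assumptions, since the pseudocode above does not specify them.

import random

def learn(corpus, num_trials, max_steps, start_alpha, start_beta,
          start_initial, start_final, add_when_wrong, mult_when_correct,
          initial_T=1.0, min_T=1e-6):
    """corpus: list of stress strings such as '101'.  Returns a tuple
    (alpha, beta, initial, final) or None if every trial fails."""
    for _ in range(num_trials):                 # Repeating for # of Trials
        alpha, beta, init, fin = (start_alpha, start_beta,
                                  start_initial, start_final)
        T = initial_T
        for step in range(max_steps):           # Repeating for Max Steps Per Trial
            if T < min_T:                       # ...or until T is very small
                break
            # take a word from the corpus (here: cycling in order,
            # which the pseudocode leaves unspecified)
            word = corpus[step % len(corpus)]
            if compute_stress(len(word), alpha, beta, init, fin) != word:
                # incorrectly predicted: perturb each value by
                # T * (random number from -0.5 to 0.5)
                new = [v + T * random.uniform(-0.5, 0.5)
                       for v in (alpha, beta, init, fin)]
                # alpha and beta may not become positive, and the new values
                # must converge for words of syllable length 2 to 15
                if (new[0] <= 0 and new[1] <= 0 and
                        all(compute_stress(n, *new) is not None
                            for n in range(2, 16))):
                    alpha, beta, init, fin = new
                T += add_when_wrong             # heat up when wrong
            else:
                T *= mult_when_correct          # cool down when correct
        if all(compute_stress(len(w), alpha, beta, init, fin) == w
               for w in corpus):
            return alpha, beta, init, fin
    return None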

 

            By exposing all of the above parameters in the program window, the user is able to manipulate the fine points of the algorithm without having to alter the source code.

            In previous research, my fellow researchers and I found that DCNs exhibit rather unusual behavior when alpha or beta is greater than zero, especially when both are greater than zero. For that reason, and in order to help limit the search space of the learning algorithm, we are dealing only with negative values for alpha and beta for the time being. This may change in future versions.

            Note that some values, while they might seem learnable at first glance, are in fact not learnable in actual execution. In particular, parameter values of exactly zero are unobtainable, because the algorithm semi-randomly walks around the search space. The value of alpha might become very small (e.g. 0.004), but it will never reach zero exactly: for that to happen, (T * random number) would have to equal exactly 0.004 (or its floating-point equivalent), which effectively never happens.

 

Further Questions:

            Please feel free to contact me by email (address at the top of the page). I am open to comments and suggestions from anyone interested in DCNs and computational models of human language.