Parsing text files

From Computational Statistics (CSE383M and CS395T)

Future exercises will require you to process large data sets in the form of structured text. This exercise is designed to ensure that you are comfortable doing this in the programming language of your choice.

A chess game can result in one of three different outcomes: either white wins, black wins, or there is a draw.

A PGN file (description of format) containing information about 10,000 real, mostly high-level tournament chess games can be found at this location. (If you use the IPython notebook server, the file is accessible at the path './data/first_10000_games.pgn'.)

Read the format description to figure out how to extract the outcome of each game from the text, then count the number of times each of the three possible outcomes occurs in the 10,000 games.
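
One possible approach, sketched in Python: the PGN format records each game's outcome in a tag pair of the form [Result "1-0"], where "1-0" means white wins, "0-1" means black wins, and "1/2-1/2" means a draw. The sketch below assumes each Result tag sits on its own line and that the file can be read as text; the names count_outcomes, count_outcomes_in_file, and LABELS are illustrative and not part of the course materials.

```python
import re
from collections import Counter

# Matches a PGN tag pair such as: [Result "1-0"]
RESULT_TAG = re.compile(r'^\[Result\s+"([^"]*)"\]')

# Result values defined by the PGN format; anything else (e.g. "*"
# for an unfinished game) is counted under "other".
LABELS = {"1-0": "white wins", "0-1": "black wins", "1/2-1/2": "draw"}


def count_outcomes(lines):
    """Count game outcomes from an iterable of PGN text lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_TAG.match(line.strip())
        if match:
            counts[LABELS.get(match.group(1), "other")] += 1
    return counts


def count_outcomes_in_file(path):
    """Stream a PGN file line by line so large files need not fit in memory."""
    # Assumption: undecodable bytes are replaced rather than raising an error.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_outcomes(handle)


# Small in-memory sample (made up for illustration, not taken from the data set).
sample = [
    '[Event "Example"]',
    '[Result "1-0"]',
    '',
    '1. e4 e5 1-0',
    '[Result "1/2-1/2"]',
    '[Result "0-1"]',
    '[Result "1-0"]',
]
sample_counts = count_outcomes(sample)
```

To apply this to the exercise data, call count_outcomes_in_file('./data/first_10000_games.pgn') and check that the counts sum to 10,000. Reading only the Result tag, rather than the termination marker at the end of the move text, avoids having to parse the moves themselves.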