The Yoshikoder is a cross-platform multilingual content
analysis program developed as part of the Identity
Project at Harvard's
Weatherhead Center for
You can load documents, construct and apply content analysis dictionaries, examine keywords-in-context, and perform basic content analyses, in any language. Here's a screenshot.
The Yoshikoder works with text documents, whether in plain ASCII, Unicode (e.g. UTF-8), or national encodings (e.g. Big5 Chinese.) You can construct, view, and save keywords-in-context. You can write content analysis dictionaries. Yoshikoder provides summaries of documents, either as word frequency tables or according to a content analysis dictionary. You can also apply a dictionary analysis to the results of a concordance, which provides a flexible way to study local word contexts. Yoshikoder's native file format is XML, so dictionaries and keyword-in-context files are non-proprietary and human readable.
Please contact wlowe [at] gov.harvard.edu if you'd like to contribute to the project.