Resources

You may need to right-click to save these files.

Dictionaries

Please note that these dictionaries are available courtesy of their authors and were translated into Yoshikoder format by me. They are not necessarily in the public domain unless their authors agree, and the Yoshikoder's open-source license implies nothing about what you can do with them. For all these sorts of questions, please ask the authors.

If you are the author and want to have a link updated or removed, or you want to correct a conversion error I have made, please contact me.

LaverGarryAJPS.ykd Laver and Garry's dictionary, from 'Estimating policy positions from political texts', American Journal of Political Science 44 pp.619-634.
Note: This dictionary supersedes previous dictionaries mounted here.
LIWC
The Linguistic Inquiry and Word Count dictionary is available, for research purposes only, directly from James Pennebaker. See also the LIWC homepage.
RID-en.ykd Colin Martindale's Regressive Imagery Dictionary (English). All versions of the RID on these pages are translations of the Wordstat files at Provalis Research.
RID-fr.ykd Regressive Imagery Dictionary (French) translated by Robert Hogenraad.
RID-pt.ykd Regressive Imagery Dictionary (Portuguese) translated by Tito Cardoso e Cunha, Brigitte Detry, and Robert Hogenraad.
RID-sw.ykd Regressive Imagery Dictionary (Swedish) translated by Torsten Norlander, Moira Linnarud, Marika Kjellén-Simes, and Robert Hogenraad.
RID-de.ykd Regressive Imagery Dictionary (German) translated by Renate Delphendahl.
nd_finance.ykd A collection of word lists from Bill McDonald for processing financial reports.

Tokenizers

The Yoshikoder can use plugin tokenizers for languages where built-in tokenization is insufficient. Currently, an experimental tokenizer plugin for Simplified Chinese is available, based on code by Erik Peterson.

SCTokenizer.jar for Simplified (Mandarin) Chinese

Document Conversion

You might find the Yoshikoder Converter useful for converting web, MS Word and PDF documents into plain text before analysis.
