RESOURCES FOR TEXT ANALYSIS
WEB-BASED AND DOWNLOADABLE CONCORDANCERS
- AntConc
A multi-purpose corpus analysis toolkit designed for conducting corpus linguistics research and data-driven learning.
https://www.laurenceanthony.net/software/antconc/
- Key-BNC: School of Liberal Arts, King Mongkut’s University of Technology Thonburi
A program for quickly performing basic keyword analyses of a corpus compared to the BNC.
Online version: https://key-bnc.tfiaa.com/
Offline version: http://crs2.kmutt.ac.th/Key-BNC/
- Corpus-Based Engineering English Materials: School of Liberal Arts, King Mongkut’s University of Technology Thonburi
An interactive website with fun and challenging activities for students learning English for engineering.
- Sketch Engine
A tool designed for text analysis or text mining applications.
http://www.sketchengine.eu/tools-for-text-analysis/
- Wmatrix
A corpus analysis tool with a web interface to the English USAS and CLAWS corpus annotation tools, and standard corpus linguistic methodologies such as frequency lists and concordances.
http://ucrel.lancs.ac.uk/wmatrix/
- English-Corpora.org
A web portal giving access to a variety of widely used corpora.
https://www.english-corpora.org/corpora.asp
- #LancsBox: Lancaster University corpus toolbox
#LancsBox is a software package for the analysis of language data and corpora developed at Lancaster University.
http://corpora.lancs.ac.uk/lancsbox/
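Several of the tools above produce frequency lists and concordances. As a rough illustration of what those two outputs are, here is a minimal Python sketch. It is not how any of the listed tools are implemented; the tokenizer, the function names (tokenize, frequency_list, kwic), the window size, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes (a naive, English-only tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens, top_n=5):
    """Return the top_n most frequent tokens as (token, count) pairs."""
    return Counter(tokens).most_common(top_n)


def kwic(tokens, node, window=3):
    """Key Word In Context: return (left context, node, right context) for each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, invented for this sketch.
sample = "The engine starts when the pump runs. The pump stops when the engine overheats."
tokens = tokenize(sample)

print(frequency_list(tokens, top_n=3))
for left, node, right in kwic(tokens, "pump"):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>25} [{node}] {right}")
```

Dedicated concordancers add much more (sorting, collocation statistics, annotation layers), so this is only meant to show the basic idea behind the outputs.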
ANNOTATION TOOLS
- CLAWS part-of-speech tagger for English
http://ucrel.lancs.ac.uk/claws/
- UCREL Semantic Analysis System (USAS)
http://ucrel.lancs.ac.uk/usas/
OTHER RESOURCES
The following websites offer comprehensive lists of useful tools for text linguistics and corpus linguistics.
- All About Corpora
https://allaboutcorpora.com/corpus-software-2
- Bodleian Libraries, University of Oxford
http://ox.libguides.com/c.php?g=422982&p=2888571
- Corpus-analysis.com
- Illinois Library, University of Illinois
https://guides.library.illinois.edu/c.php?g=405110&p=5804542
- Laurence Anthony’s Website
http://www.laurenceanthony.net/software.html
- School of Humanities and Sciences, Stanford University
https://linguistics.stanford.edu/resourcescorpora/corpus-tools
- W3-Corpora Project, University of Essex
https://www1.essex.ac.uk/linguistics/external/clmt/w3c/corpus_ling/content/software.html
THAI LANGUAGE ANALYSIS TOOLS
THAI WORD SEGMENTATION PROGRAMS
- LexTo
- TLex
- LexTo+ (Dictionary-based, Longest matching)
https://aiforthai.in.th/service_bn.php
- TLex+ (Machine learning, Conditional Random Fields)
https://aiforthai.in.th/service_bn.php
- Thai word segmentation
http://161.200.50.2/wordsegment
- Thai syllable segmentation
http://161.200.50.2/sylsegment
- Thai chunks
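Thai is written without spaces between words, which is why segmentation tools are needed. The "dictionary-based, longest matching" approach mentioned for LexTo+ can be sketched roughly as below. This is a generic greedy illustration, not LexTo+'s actual code; the function name longest_matching, the toy dictionary, and the fallback for unknown characters are assumptions made for this example.

```python
def longest_matching(text, dictionary):
    """Greedy left-to-right segmentation: at each position, take the longest dictionary entry."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the candidate one character at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            # Assumption: an out-of-dictionary character becomes a one-character token.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Toy dictionary, invented for this sketch.
toy_dictionary = {"ตา", "ตาก", "กลม", "ลม"}
print(longest_matching("ตากลม", toy_dictionary))  # ['ตาก', 'ลม']
```

The string "ตากลม" can be read as ตา + กลม or as ตาก + ลม. Greedy longest matching always commits to the longer first word, so it returns the second reading regardless of context. Ambiguity of this kind is one motivation for the machine learning approaches also listed above.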
CORPUS
- LST20 Corpus
A corpus of 3,164,002 words for Thai language processing, developed by the National Electronics and Computer Technology Center (NECTEC), Thailand. It offers five layers of linguistic annotation: word boundaries, POS tagging, named entities, clause boundaries, and sentence boundaries.
https://aiat.or.th/lst20-corpus/
- Thai National Corpus
A general corpus of 14 million words designed to be comparable to the British National Corpus. It was created by the Department of Linguistics, Faculty of Arts, Chulalongkorn University.
http://www.arts.chula.ac.th/~ling/tnc3/
CONCORDANCE
http://www.arts.chula.ac.th/~ling/ThaiConc/
by Wirote Aroonmanakun, Chulalongkorn University
SENTIMENT ANALYSIS TOOLS
- Sentiment Analysis
https://aiforthai.in.th/service_sa.php
by NECTEC
- S-Sense: Social sensing
WORDLIST
http://www.arts.chula.ac.th/ling/tnc/searchtnc/
POS TAGGERS
- TLex++
(TLex++ segments Thai text and tags each word with its part of speech, using machine learning with the Conditional Random Fields algorithm.)
https://aiforthai.in.th/service_bn.php
- Thai POS Tagging
SPEECH TO TEXT
- Partii (A service that converts speech into text.)
https://aiforthai.in.th/service_st.php
by NECTEC
OTHER RESOURCES FOR THAI LANGUAGE ANALYSIS
http://thainlp.wannaphong.com/p/corpus.html
https://thailang.nectec.or.th/archive/indexdca0.html?q=node/21
https://aiforthai.in.th/index.php
https://saki.siit.tu.ac.th/thainlp/