KDD-D / KDD-TCoNLL 2008Unsupervised Dependency Parser for English, version 1.0ptbconvThe LTH Constituent-to-Dependency Conversion Tool for Penn-style TreebanksSemeval 2010 task Coreference Resolution in Multiple Languages' Italian corpusDESR dependency parserMPQA subjective lexiconJointly Extracted and Compressed TAC 2009 Document SetsLDC Arabic Penn TreebankMaltArabic syntactic dependency parser model with functional morphology featuresDUCNTCIR 08 Training DataLEXICAL NORMALISATION ANNOTATIONS FOR SHORT TEXT MESSAGESPDTB parserMovie Review Sentiment Polarity DatasetMPQAEuroparl (WMT 2006)DiversityEnron QueriesNIST 2009 MT Evaluation Set (Urdu-to-English)Crowdsources translations, edits, and rankings for the 2009 NIST Urdu-to-English datasetCambridge Learner Corpus - First Certificate in English exam scriptsSVMlight, SVMperf, KNP, Moses, FREQTCzEng09Stanford Dependency ParserUNT Computer Science Short Answer Dataset v2.0LingPipe Sentence ChunkerBritish National CorpusEnglish Gigaword CorpusMessage Understanding Conference 4 - Terrorism CorpusThe Konan-JIEM Learner CorpusEuroParl corpus v 6.0LFA-11Baidu ZhidaoBCCWJThe extended REX-J corpusthe testing data of the Chinese Word Sense Induction task of CLP2010TAC-KBPSUC 2.0UIUC question classification 5500 dataEuroparl, News commentaryEDR dictionaryTREC 2010 Entity TrackGGJUMANDarpa TIDES surprise language dataset + internally collected data2006 CoNLL Shared TaskContact Center Data (call speech data, call logs, summaries of the speech data)Chinese word segmentation corpus from the Second International SIGHAN Bakeoff data setsQuaero BN Named Entity CorpusPenn Chinese Treebanks and Chinese gigawordChinese TreebankNihongo-Goi-TaikeiPrinceton WordNet 3.0Assert parserKyTeaDUC 2005Reviews-9-productsLDC GigaWord CorpusName entity recognition corpus from the Fourth International SIGHAN Bakeoff data setsNTUSDKadokawa-Ruigo-Shin-JitenUKB: Graph Based Word Sense Disambiguation and Similaritynaist-jdicseminerlda: Collapsed Gibbs 
Sampling Methods for Topic ModelsSnowball StemmerThe BeastKyoto University Case FramesLibSVMThe Multi-Domain Sentiment DatasetMPQANihongo Goi Taikei: Japanese LexiconMainichi newspaperNew York TimesJsafranPENTAtrainer.praatAMI meeting corpusAMIAurora 2Aurora2Aurora 2 databaseBABEL Hungarian Speech DatabaseCSLU Names v1.3CMU Let's Go CorpusCMU ArcticSPEECONEXMARaLDA Partitur-EditorHIWIREISIP environmental noise signalsSpeechDat(II) ITMARY TTSQuaero Named Entities Training SetSpeechDat(II) ESSpeechDat(II) SFSpeechDat(II) SZThe EMIME Mandarn/English Bilingual DatabaseTIMITWSJCAM0 British English speech databaseTerraUBY-LMFUBYliblinear 1.51MLSA A Multi-layered Reference Corpus for German Sentiment AnalysisMMPC: A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in LanguageFRMGAsiya-OnlineCLaRK SystemCLaRKRecon: Annotation Tool for Concepts and RelationsArabic Gigaword (Fourth Edition)Arabic Treebank (ATB) of Broadcast NewsArabic Treebank: Part 3 v 3.2Arabic-English Parallel Aligned TreebanksASAASIt - (Atlante Sintattico d'Italia, Syntactic Atlas of Italy)AAC - Austrian Academy CorpusPHONOLEXBasic dictionary of FinSL example text corpus (Suvi) Basic dictionary of FinSL example text corpus (Suvi)BYU-BNCcalbc corporaCMDI Information PageReconcile - Coreference Resolution EngineCOWCorpus DEFTCINTIL PropbankCroatian Inflectional Lexicon MOLEXXLEL-21BulTreeBank-DPBulTreeBank: Syntactically annotated corpusDICOVALENCE 2DiCoInfo - Dictionnaire fondamental de l'informatique et de l'InternetDrugs@FDAAhoEmo3English CHILDES Verb Construction DatabaseEnglish Gigaword (Fifth Edition)feat - Flexible Error Annotation ToolFoLiATXTcollectorGenChal RepositoryGOLD: General Ontology for Linguistic DescriptionDeWaCGLOSSIRHFST toolsIWSLT 2011 parallel TED CorpusjMWEJRC Eurovoc Indexer JEXkthxc - KTH eXtract CorpusLOB corpusLG-evalAhoSynLearner Corpus of Hungarian (tentative name)LEGO: Lexicon Enhancement via the GOLD OntologyMAZEAMIDIKIPOWLAMARC-2000 Modern Arabic 
Representative Cortpus 2000MorphoAdorner name recognizerParTUTMultiUNMPQA sentiment lexiconNKJPNIdent-CAOfficial Europarl test set from WMT 2008.PANTERAPaCo2: Parallel Corpora CollectorLEGO (Parameterized & Annotated CMU Let's Go Database)PerTreeBankPALinkAPORTMEDIA DomainRTE2 Test SetCRPC Modality sampleSrpRec - Serbian morphological electronic dictionarySrpRecExt - The extension of the Serbian morphological electronic dictionarySrpWN Serbian WordnetSWift - SignWriting improved fast transcriberSimMetricsSMI Remote Eye Tracking Device.AhoSpeakersSPPASSoNaRSwedish Kelly list, frequency-based vocabulary list for language learnersLefff 3.0AQUAINTBlastCONCISUS CORPUSDGT-TMFLOBOPUSTree-TaggerUKPMCVPs-30-EnW2C Web CorpusWebCAGeWAGGERWOLFOpenMaTrExFinnTreeBank 1ASA-PLEuroparl corpusGLPKJRC-AcquisCLUTOSETTDT3PELECAN: PRONUNCIATION ERRORS FROM LEARNERS OF ENGLISH CORPUS AND ANNOTATIONCECOS: A CHINESE-ENGLISH CODE-SWITCHING SPEECH DATABASECReSTSemeval 2010, Task-10 test and training dataBijankhanSVMLight-TKPro3GresIUPAC CorpusDARE: domain adaptive relation extractionEurparlHCRC Maptask CorpusThe Icelandic Frequency DictionaryJRC-NamesMPQA corpusMPQAOMICSRSTToolSRILMStanford TaggerSuggested Upper Merged OntologyPenn II TreebankTuebaD/ZUMLSUMLS MetathesaurusConVoteEnglish/Hindi and English Arabic Gold Standard for TransliterationKDD-D / KDD-T DatasetsCoNLL 2008 Shared Task DataBrown Coherence ToolkitUnsupervised Dependency Parser for English v 1.0ptbconv-3.0Penn TreebankNEDNAIST Text CorpusKyoto Text CorpusSemEval 2010 Coreference Resolution Task CorpusTextProChaSenDeSRCaboChaMovie Review DataMulti-Domain Sentiment DatasetSubjectivity LexiconFrameNet 1.5SEMAFOR 2.0WordNetDIRTSherlockTPEAraucariaTAC 2009 Document SetsArabic Treebank (ATB)Corpus of Arabic Functional MorphologyMADAMaltParserArabic Syntactic Dependency Parser ModelColumbia Arabic Treebank ConverterDocument Understanding Conference (DUC) CorpusWMT 2010 Translation Task DataNTCIR 2008 Training DataLexical 
Normalisation Annotations for Short Text Messages (LexNorm)Penn Discourse TreebankPenn Discourse Treebank ParserAutomatic Text Labelling of TopicsLarge Movie Review DatasetTarget-dependent Twitter Sentiment Classification AnnotationMPQA Opinion CorpusNTCIR Opinion CorpusTAC KBP 2009 DataTDT5 Brown Word ClustersEuroparlJoshuaCharniak parserACE 2005CCGBankSAMT extensionEvaluation Data for Hyponymy Relation MiningGoogle N-gram corpus Web 1T (2006)Diversity in Collective DiscourseQuery Dataset for Email SearchNIST 2009 MT Evaluation SetCrowdsourced translations, edits, and rankings for the 2009 NIST Urdu-to-English datasetSwitchboard CorpusCLC FCE DatasetWikipediaNELL Ontology and Knowledge BaseMicrosoft Research Video Description CorpusMADA (Morphological Analysis and Disambiguation for Arabic) tool kit.LinGO Grammar MatrixTAC KBP Annotation and Assessment GuidelinesNUs Corpus of Learner English (NUCLE)MosesSemeval 2010 word sense induction and disambiguation datasetHindi Projected Treebank from English-Hindi Tides Parallel CorpusCzech-English Parallel Corpus (CzEng)MapTaskEnron EmailsEvaluation Annotated/Unannotated data and evaluation code for NP coordination disambiguationStanford ParserUNT Computer Science Short Answer Dataset v 2.0LingPipeEnglish Incremental Right-Corner Grammar for HHMMupparseDeceptive Opinion Spam Corpus v1CELEX2British National Corpus (BNC)Classification of News Articles on Contentious IssuesPenn Chinese TreebankHebrew and Arabic Morphologically SegmentedEnglish GigawordMessage Understanding Conference (MUC) 4 Terrorism CorpusKonan-JIEM Learner CorpusCoNLL-X and CoNLL 2007 datasets20-NewsGroupsWebKBXinhua ChineseStanford Log-linear Part-Of-Speech Taggerir4qa_evalService Quality Evaluation Data SetMinna no Hon'yaku (MNH, Translation for ALL)ANERcorp + Our Own Corpus20newsgroupReuters-21578SpamassassinScale datasetSimple Rule Language Global Health RulebookSimple Rule Language EditorBioCaster OntologyRen-CECps 1.0NTCIR's Japanese patent document 
corpusNEU-Restaurant-ReviewRestaurant-Review-Snyder and Barzilay (2007)ROUGE-1.5.5breakSent-multi-lf.plSemCorChinese CCGbankChinese Penn Treebank 6.0Web 1T 5-gram corpus30 noun pairs from Rubenstein and Goodenough, and by replacing them with their definitions from the Collins Cobuild dictAmazonCNProductReviewsFreeLangMeSH (Medical Subject Heading)Multilingual glossary of technical and popular medical termsFIRE 2010 dataenglish to hindi dictionary shabdakoShaUnsupervised incremental parserNEGRAChinese PennTreebankWSJ Penn TreebankChinese Proposition BankPDTBIRNA newspaper text corpusAryanpour Persian to English dictionaryUSENET corpusMATEUkwabelana corpusDocument Understanding ConferenceROUGEBNCThe Switchboard-1 Telephone Speech CorpusTREC-8 collectionLucenewekaAn unsupervised incremental parser (CCL)Chinese product reviewsNTU sentiment dictionaryAppraisal lexiconMPQA subjectivity lexiconSentiWordNetMovie review data setStanford POS taggerReutersISOLETEuroparl v3NICT JEL corpusTo be announcedSzeged LVC CorpusmorphStanford Lexicalized ParserMorfessorCELEXLingua::JA::Summarize::ExtractMeCabText summarization corpus for the credibility of information on the WebTSUBAKIProp BankPennTree BankFBIS Corpustest set containing non-compositional and compositional phrasesEnglish_VPNtest set for QA candidate rankingmstparserHowNet Knowledge DatabaseMaltConverterChinese Treebank 5.0movie review datasetbooks, DVDs, electronics, and kitchen appliancesPT-EN /EN-PT translation lexiconYahoo! 
Answers QA Pairs under Healthcare DomainUGC tokenizerPortuguese Twitter corpusCLEF 2009 test collectionsNLTKDoshisha eye-gaze dialogue dataBerkeley ParserukWaC corpusCQPTDT4Evaluation Benchmark for Bilingual Lexicon ExtractionFreelingChinese Temporal Annotation Data SetChinese TreebankBrandeis Annotation ToolILSP/ELEFTHEROTYPIA MODERN GREEK CORPUSNTCIR patent corpusICTCLASarXMLivFresaBrill's TaggerEnglish Penn Treebank, Chinese Penn TreebankSentence Re-ranker based on Information Extractionanswer selection datasetRTE data setACE 2004 training dataJenaStanford NERAQUIANTTREC QuestionsChinese emotion lexicons(five emotions)Chinese Language Technology PlatformHowNetCOAE2008-task3Cross-lingual event predicate clustersevent annotated ontonotesEvent Annotated Carbon Sequestration data?????(Internet lexicon SogouW)Web 1T 5-gram Version 1natural language toolkitWordNet 3.0Yahoo! web searcherFlorianpolisWordNetBR or TEPBTECad hoc tasks of TRECOpenNLPOpenCalaisKEABrown corpusTycho Brahe parsed corpusTSUBAKI document collectionMulti-media Multi-lingual concept, relation and event annotated corpusConcept mapping table between video and textSemeval 2007 English Lexical Sample corpusChasen Japanese Language parserNTCIR-8 Mainichi Shinbun 2005Semeval 2010 Japanese Lexical Sample corpusTREC AP corpusPorter stemmerDeliciousVerbOceanNLTK for PythonFrameNet and WordNetBioInferMoguraIEPAAkaneREHPRD50AIMedLLLCKIPStanford Named Entity RecognizerNTCIR-8 Patent Translation dataJUMANMainichi NewspaperDUC 2007 Summaries and Pyramid annotationsBilingual corpus on patent domainComplexChineseQAtestdataDegExtNTCIR CLQA Chinese QuestionsThe Penn Chinese Treebank 6.0DUC-2006, DUC-2007 dataPenn Chinese Treebank 6.0Dan Bikel’s randomized parsing evaluation comparatorBayonKNPNIST MT 03-06 training and test corporaXinhua of GigawordBi-sentences,lexicon LDC2005T34,Name Entity LDC2005T34NIST MT 03-05 training and test corporaText Analysis ConferenceGIZA++SogouTManipuri POS taggerNamed Entity Recogniser 
for Manipuri using SVMPOS.LMManipuri StemmerSRILMManipuri-English Parallel CorpusYAMCHAStanford Dependency ParserTinySVMNISTmorphaJournalisticNL11CTB6Lefff 3.0Word Relatedness DatasetsANNODIS corpusGoogle Web1T corpus (LDC2006T13)IMDB actorsUMLSEuroparl and News-Commentary corporaReview referencesPeoples Daily from 1993-1997BLOGS06TREC corpus Disk4&5WT10gTreeTaggerGerman LFG grammarTiGerCoNLL 2000 shallow parsing data setEnglish Chinese Translation Treebank 1.0Chinese Treebank 6.0500M Japanese Sentences on the WebJeuxDeMots lexical networkMultilingual Statistical Parsing EngineEvent ExtractorEnjuCharniak-Johnson reranking parserC&C ToolsThe LTH Constituent-to-Dependency Conversion Tool for Penn-style TreebanksGDepGENIA treebankBioNLP'09 shared task data setPWKPEnglish WikipediaTDT corpusAlpinoTwente Newspaper CorpusThe Prague Dependency Treebank 2.0English-Korean Parallel CorpusTIDES Extraction (ACE) 2003 Multilingual Training DataKorean RDC corpusPeople's Daily CorpusUyghur to Chinese MT corpus (UCC)Bengali NEWS Editorial Opinion CorpusBengali Blog Opinion CorpusSentiWordNet (Bengali)MPQAMUC6Matlab SOM-ToolboxWordNet 2.0ir packageJWNLCorpus of Interactional Data (CID)A Large English-Chinese Parallel CorpusLinguistic Data ConsortiumBengali NEWS Editorial OpinionUyghur Encyclopedia (UE)target datasetLTPFNE datasetFZ NER ToolShared Swedish/English Regulus grammarEnglish Gigaword Fourth EditionFBIS and MTC data setsBLEUSynmttkGerman dependency treebank with new automatic featuresA Uyghur Tokenizer and part-of-speech taggerChinese Penn TreebankWikipedia (English)Wikipedia (German)dict.cc lexiconUyghur parsing corpusJuliusSimultaneous Interpretation DatabaseClause Boundary Annotation ProgramDUCEncycloMedical Subject HeadingsDutch Sentiment Lexicon (adjectives)DECA Species CorpusSimulated Contact Center DialoguesSecond International Chinese Word Segmentation Bakeoff DataFive PPI CorporaChinese Learner English CorpusPropBankChinse Verb Error Evaluation CorpusEnglish 
Gigaword Second EditionEncarta treasuresChinese Proposition Bank 1.0Bikel parserMSNBC News test setYahoo! News Resolution SetListening-oriented DialoguesIban-English LexiconIban corpusQTagIban-Malay LexiconWordNet 3.0GATEEmotion holder AnnotatorEmotion Blog CorpusWordNet AffectVerbNetCRF ChunkerEmotion Topic annotated blogAffect databaseSentiFulSentences from Experience ProjectConnexor Machinese SyntaxWMT 2010 system combination task corpusBerkeleyParserAAC-Austrian Academy CorpusWikipedia dumpTDT3, TDT4, TDT5Simple English WikipediaEnglish WiktionaryGCIDEMicrosoft Research Question Answering CorpusSimple English WiktionaryOmegaWikiSUCREFeature-GroupingEnglish TreebankVarro ToolkitCIPS-eval dataFreebaseEnglish - Bengali Parallel CorpusMalt ParserFrench Tree BankGH-MAPCoNLL Shared Task 2009 CorpusPenn Chinese Treebank 5.1Tsinghua Chinese TreebankMPQA Opinion Corpus version 1.2 with additional judgmentsSpanish-English EuroparlVerb Noun ListVerb OceanSogou Query LogAOL 2006 Query LogCLEF corpusWeb ServiceEnglish Web as Corpus (ukWaC)Penn Chinese Treebank 5MPC -- multi-party chat corpusmovie review data subjectivity datasetsGALE Y1 Q2 Release - LDC/FBIS/NVTC Parallel Text V2.0Revenue CorpusSRI language model toolkitCRF++Stanford Chinese Segmentermteval scoring scriptJoshua - open source hierarchical phrase based systemEvent and non-event nouns test-setStanford NLP POS taggerEMMA - Evaluation metric for morphological analysisEnglish to UNL Corpusidiomatic sentences test datasetWan's keyword extraction datasetKyoto University Text CorpusA simple C++ library for maximum entropy classificationpeccoHyponymy extraction toolChinese-English Translation Lexicon Version 3.0Eijiro, Third EditionEDR Electronic DictionaryWanfang Data Chinese-English Science and Technology Bilingual DictionaryEDICTMainichi Shimbun CorpusmmaEnglish Gigaword Corpus Fourth EditionNews Tweets For SRLTimeBankCCG SRL toolRussian corpusDMoZ corpusACE 2007BioCreative 2 Gene Mention Recognition CorpusCoNLL 
2003 NER shared task corpusSCD-based Dictionary Entry ParserEuroparl with multilingual synsetsCooperative Remote Search Task (CReST) corpusMCPGGyaanNidhiGyaan NidhiLexiconEILMT parallel corpusACESRL annotated data for UrduSALTOHebrew annotated documentsmulti-score summarizerTAHAnon-native speaking data dataMicrosoft Web N-gram ServicesWiktionaryBritish National CorpusCDS datasetGold Standard for Sentence ClusteringEmotiBlog corpusEmotiBlog annotation modelTIGER CorpusEmotion Annotated data for UrduDUC 2002 Summarization CorpusWordNet Sense RelateHPSG-WSJMarkus DickinsonHindi Dependency TreebankSoftwareConsumerMeterA Corpus of Plagiarised Short AnswersPAN-PC-09Multi-Domain Sentiment Dataset (version 2.0)Rapidminer 4.6JWI (the MIT Java Wordnet Interface)Penn Discourse Treebank 2.0RST Discourse TreebankUKB: Graph Based Word Sense Disambiguation and Similarityydta-yanswers-manner-questions-v1.0MG4JLibsvm20 Newsgroups Document Categorization datasetR52UIUC Question classification DatasetThe Multimillion Q&A Pair CollectionMicroblog/Twitter Summarization Data SetJavabased MaxEnt packageIJCNLP-08 NERSSEAL Shared Task DataNamed Entity Annotated DataWeb-based Bengali news corpusRTGenGenISpatial patterns datasetMRP Readability CorpusReuters Corpus annotated with NP coreferenceSentiProductLexiconTERGenSemUrdu Resource GrammarInspec Database Keyword Extraction Data SetDocument Understanding Conference Past Data for Text SummarizationASKNetConceptNetPicture Books OntologyCoreference resolution data in opinion mining domainBAF corpusEuroparl (Manually annotated)Unsupervised, Language Independent Sentence AlignerNE listsBKB (BKB-nytfootball-v0.7.5)BioNLP'09 Shared Task on Event ExtractionCMC ICD coding corpusObesity datasetMorphadorner2007 Computational Medicine Center Challenge CorpusTextual Entailment Specialized DatasetSelf-Annotation ToolOpenNLP ToolkitFSParGerman Wikipedia articlesWordNet 2.1SenseLearner 2.0Stanford POS Tagger 1.6Stanford Names Entity Recogniser 
1.1SIGHANTreebankAdaptiveCorefNgram Search EngineHansard Corpus, Public Release of Haitian Creole Language Data by Carnegie Mellon, FBISCelebrityOpen Directory Project Full CorpusWePS benchmark dataEvaluation of BioExcom on the BioScope corpusEmotiNetFormality Word ListsICWSM 2009 Spinn3r Blog DatasetmwetoolkitGeniaMWE 2008 data setsTiger treebankReuter-2157820-NewsgroupPenn TreeBank 3, Switchboard corpus partIndic Language Transliteration DataEnglish Gigaword CorpusBeijing language acquisition corpusJ KalitaItalian CCG Treebank (CCG-TUT)ERGenju HPSG parserMEDLINE databasesentence taggeranonymized-for-blind-reviewlongest_nereeMedT_NERLexeedRomanized Text Language IdentifierWMT 2009 datasetSinica Corpus of Modern ChineseGeneral EnquirerCnet product reviewsArabic Penn TreebankACE 2005 ArabicACE 2005 Handwritten ArabicLDC2008T19DUC2002Generative Semantic Parsing Model using Hybrid Tree FrameworkWASP and WASP^{-1}Robocup sportscasting corpusWeb Dataset for Text-based Image Annotation DevelopmentPTB NP Bracketing Data 1.0Google V2 and n-gram toolsComputerWorldEnglish Entity Detection and Tracking corpus for 2004 ACE projectMultilingual MPQAPerseus Latin Dependency TreebankRWSCorPMC Open Access SubsetC&C toolkitKorean emotional speech corpusKorean TV drama scriptsOntology created from Wikipedia Animal articlesChinese Emotion Corpusoracle database qa forumThe 4 Universities' datasetDish Names in Chinese Language Blog ReviewsWikipedia Vandalism Corpus WEBIS-VC07-11Question RankingChinese-English sentence level aligned bilingual corpusMICA20 newsgroupsReuters RCV1ChineseBookDescriptionWithTagsChineseBlogsWithTagsDORISpointingApril10TSUBAKI CorpusKoeling et al. 
(2005) corpusKanji TesterKanji Tester response logsJMdictChinese law articlesICTCLAS(Institute of Computing Technology, Chinese Lexical Analysis System)Chinese acedemic papersBioScopeUW parallel meeting corpusdict.ccopen thesaurusdingde-newsJWPL/JWKTLTranslation AnnotatorJRC AcquisKulkarni Name CorpusTWA sense tagged dataAcl Anthology Network (AAN)GermaNetPenn Tree BankStockholm Ume CorpusCHILDESJapanese WikipediaMMSEGNLTK packageText_Classification_Reuters_CorpusA thesaurus of argument structure for Japanese verbsMEDLINE/PUBMEDDBLPWorld Atlas of Language Structures (WALS)FrameNetSemEval2010-task 10Tagged Medical Forum DataChinese Collocation Dictionary of Content WordsTaKIPIBTaggerIPI PAN Corpus of Polish (manually disambiguated part)Yahoo!AnswersTest dataAmazon Reviewsi2b2 2009 shared task on medication extractionIREX data set for NE recognitionWikipedia Infobox ExtractsMUC Coreference Data SetACE-2 data setCoNLL 2005 DatasetFATE corpusFN transformerShalmaneserDUC 2002 DatasetKyoto Text Corpus version 4.0mogura HPSG parserBLIPP 1987 & 1988 corpusPenn Treebank 3.0Chinese Treebank 5.1NTUSDTYPO (misspelling) CORPUSMedisysAnnotated corpusOpinosis Summarization Demo SoftwareTopic Related Review SentencesMEAD Summarization ToolStanford's NLP ParserAmazon Mechanical TurkHercules DalianisRelevant Term extractorWikiXMLDumpTextExtractorSighan 2005 bakeoff dataSogouT CorpusAutomatic Content ExtractionLangid.pyMPQA DatasetBiomedical Gene Mention Linking CorpusGraded Compositionality Scores for Compound Nouns"Yahoo! 
Chiebukuro" dataUofT Blog CorpusJapanese National Pension Law CorpusProbabilistic Word Classes with LDALanguage Function Analysis Corpus (LFA-11)LGLexBaidu Zhidao CorpusAmazon.com Review Rating Prediction DatasetBalanced Corpus of Contemporary Written Japanese (BCCWJ)NTCIR-3 WEB (Web Retrieval Test Collection)Extended REX-J CorpusSemEval 2007 Lexical Substitution Task DatasetBS Computer Science CorpusMicrosoft Research IME CorpusCLP2010 Testing Dataset of the Chinese Word Sense Induction TaskKnowledge Base Population Corpus (TAC KBP)Customer Review DatasetJapanese Extended Named Entity CorpusStockholm-Ume Corpus (SUC) 2.0UIUC Question Classification Data (Training set 5)TREC Entity 2010Digital Review Data SetChinese Web 5-gramBasic Travel Expression Corpus (BTEC)North American News Text CorpusMMAXa User-Extensible Morphological Analyzer for Japanese (JUMAN)Darpa TIDES Surprise Language DatasetFTA Labeled ACL Anthology AbstractsSandhi Parallel CorpusGALE LDC Parallel DataGeppettoNatural Language Programming CorpusHyderabad Dependency TreebankCoNLL 2006 Shared Task DataWeb 2.0 TreebankCross-Language Entity Linking Test CollectionContact Center DataCantonese-Mandarin Parallel CorpusOpenMWEErgbioQuaero Broadcast News Named Entity CorpusYUWEI CorpusNihongo Goi TaikeiHindi TreebankWord Clipping Test SetMarkov thebeastMorfetteSupervised Latent Dirichlet Allocation for ClassificationAutomatic Statistical SEmantic Role Tagger (ASSERT)Kyoto Text Analysis Toolkit (KyTea)Document Understanding Conference (DUC) 2005 DatasetAdditional Review Datasets (9 products)LINA-PAL 1.0Sandhi RulesNamed Entity Recognition Corpus from the Fourth International SIGHAN Bakeoff Data SetsNTU Sentiment Dictionary (NTUSD)Sogou User Input RecordKadokawa Ruigo Shin JitenGraph Based Word Sense Disambiguation and Similarity (UKB)NAIST Japanese DictionarySemisupervised Named Entity Recognizer (SemiNER)Collapsed Gibbs Sampling Methods for Topic Models (lda)SnowballKyoto University's Case Frame Data 
1.0Transcoding Sanskrit Formatsa Library for Support Vector Machines (LIBSVM)General InquirerMulti-Domain Sentiment Dataset 2.0Steady Selling Product Review DatasetLIBLINEARMainichi Newspaper DatabaseNew York Times CorpusJava Syntaxico-semantic French Analyser (J-Safran)WSJ0-WSJ1A Praat script for extacting pitch targets from vocal signals (PENTAtrainer)Augmented Multi-party Interaction (AMI) meeting corpusAURORA Project Database 2.0 - Evaluation PackageBABEL Hungarian Speech DatabasesCentre for Spoken Language Understanding (CSLU) Names v1.3CMU Let's Go DataCMU_ARCTIC speech synthesis databasesCzech Speecon databaseExtensible Markup Language for Discourse Annotation (EXMARaLDA) Partitur-EditorHIWIRE (Human Input that Works In Real Environments) databaseInstitute for Signal Processing (ISIP) environmental noise signalsItalian SpeechDat(II) Modular Architecture for Research on speech sYnthesis Text-to-Speech System (MARY TTS)Quaero named entity corporaSpanish SpeechDat(II)Swiss-French SpeechDat(II)Swiss-German SpeechDat(II)The Accents of the British Isles (ABI-1) Speech CorpusThe EMIME Mandarin/English Bilingual DatabaseTIMIT Acoustic-Phonetic Continuous Speech CorpusWSJCAM0 Cambridge Read NewsTransducersaurusKLAIR ToolkitObject-Based Seech RecognizerPhonetisaurusSyllitestRussian Emotional Corpus (REC)Boston University Radio News CorpusQuaero Extended Named Entities annotation guideVery Large Pronunciation Vocabulary for RussianAurora Project Database - Revised Aurora Noisy TI digits database - (Version 2.0)AhoVoicedDBAMI corpusUtsunomiya University Spoken Dialogue Database for Paralinguistic Information StudiesBoston Radio News CorpusKALAKA-2SpeechDat(II) ENWinPitchJapanese phonetically-balanced word speech databaseLIPS 2008 AV CorpusNIST LRE 2007EMU Speech Database SystemWashU-UCLA Corpus of Subglottal AcousticsAKUEMEnglish Read by Japanese (ERJ) databaseSpeech Feature Tool available at Centre for Speech Technology University of EdinburghTest Audio Data for 
Repetition DetectionTranscriberWitchcraft WorkbenchDomainEditorHOESIElectropalatographic corpus for Standard ChineseRomanian Speech Synthesis corpusKALAKA 2The Edwardians: family life and work before 1918Quaero Named Entities evaluation toolAnnotation guidelines for Dutch-English word alignmentGold Standard corpus for Dutch-English word alignmentHandAlignxpressive Speech Labeling Tool Incorporating the Temporal Characteristics of EmotionNIMITEK CorpusMulti-Lingual Image captionsCatchWord Speech SynthesiserLexique et grammaire de dérivationA fragment of Northern Sotho grammar: The verb of Northern SothoPeykare Or Textual Corpus of Persian LanguageStuttgart Finite State Transducer ToolsWikipedia (Turkish section)TRmorphZemberek spell checker word listMETU Turkish CorpusText+Berg CorpusAttribution Corpus of ItalianISSTMMAX2AVALONBase de datos sintácticosCorpus of temporal-causal structureThe English-Swedish-Turkish Parallel TreebankLink GrammarProject documents ontologyCLANNWordsmith toolsUnitexCalendar Expression Semantic TaggerNaviTexteAlborada-I3A corpus of disordered speechCharlatan Synthetic Dialog CorpusEEP search interface evaluationReal-word error corpusA list of confusion setsWitchcraftNo nameCorpus Chaines de Reference (CoChainRef)TTLCoreference Named Entity (CoRefEN)Coreference Chain Genre dependent identification module (CoRefGen)DPC: Dutch Parallel CorpusA Parallel Corpus of Monologues and Expository DialoguesOntoLing's ontologiesLAF/GrAFMETHONTOLOGYWebODEOntoTag's ontologiesOntoLing annotation modelA taxonomy of discourse (coherence) relationsPolish websites corpusWikipedia MinerOpenThesaurus2009's Text summarization corpus for the credibility of information from the WEBLX-ParserLX-Parser WebserviceLX-ServiceAdd MS KitStandards for Controlled LanguagesCourse material, writing manual and evaluation techniquesCourses on the writing of safe and safely translatable alert messages and protocolsMULTEXT-East Version 4JOSSloWNetFidaPlusSlovene Term 
Extractordifferential semantics synset annotationMultimodal Russian Corpus (MURCO)Tree-to-tree alignment tool Lingua-Alignbilingual corpusKalashnikov 2K dependency bankFISCALDBMARITERMItalWordNetCMT Corpus of Maritime terminologyCFT Corpus of Fiscal TerminologyEuroWordNetWordNet 1.5 and 3.0SINDACDBCST Corpus of Syndicate-labour terminologyVOLEMDictionary of Affect in LanguageSentiWordNet 1.0.1XuxenBfomaPELCRA Search Engine for the National Corpus of PolishAnotatorniaText Encoding Initiative (TEI)National Corpus of PolishPoliqarpFAUSSCINTIL Logical Form BankCINTIL DeepGrambankCINTIL TreebankCINTIL Dependency BankCINTIL CorpusCINTIL PropbankmorphistoGERTWOL/GERGENStripey ZebramOLIFdeDiabasegrande grammaire du françaisTagged and Cleaned WikipediaProprietary Interactive Voice Response CorpusNear-Identity Relations for Coreference (NIDENT)Annotated Corpora (AnCora)OntoNotesUtility Evaluation for Information ExtractionSonarText Encoding Initiative (TEI)Gold Standard for Dutch sentiment bearing adjectivesMulticultural Romanized Name MatchesCorpus of Ambiguous Abbreviations and Gene Names in the Biomedical DomainTRIPS OntologyTRIOS-TimeBank corpusHebrew-English transliteration dictionaryService-Finder Automatic Semantic AnnotatorJapanese Lexicon AcquirerJapanese Web CorpusHebrew CHILDES corpusSpeechDat, Callfriends, Broadcast newsGTAAdbpediaEASYLEXIndex Thomisticus TreebankCommentExtract 1.0STExAssociative Concept Dictionary (ACD)Associative Concept Dictionary for VerbsTest Bench for transliteration of Indian language to EnglishNomage lexiconFrench TreebankVerbactionNomage CorpusLUNA corpus of conversational speech in ItalianThe NumGen (Generating Numerical Expressions) CorpusQuæro QA corpusMorphOzWEB-QA and TREC-QAGRANSKA taggerSUCCogFLUXEXMARaLDAFOLKERdrhumanEster 2 Named Entity CorpusQuæro named entity corporaMainichi News PaperAozora BunkoLinES guidelinesI*LinkMachinese SyntaxLinköping English-Swedish Parallel Treebank (LinES)Alpaco alignment editorUrdu Transliteration 
ToolsAMI meeting corpusISST-TANL Dependency Annotated CorpusTurin University Treebank (TUT)AppraiseSpeech Recordings for Unit Selection CorpusEvaluation Tool for Subjective Loudness PerceptionSIGHAN Bakeoff 2006 Chinese Word Segmentation DataVALLEX 2.5TrEdPDT-VALLEXPML-TQPrague Dependency Treebank 2.0DanNetNeue Zürcher ZeitungHaGenLexRegression-Forest TaggerPUNKTPreposition Noun CombinationsPiNERDbt & FaccetteLymba's abbreviation dictionaryKYOTO-architecture / Knowledge Yielding Ontologies for Transition-based Organization’ - architectureSrebrenica corpusBitParPAC (Predicate Argument Clustering)RCV1Semantic spacesAppraisal Lexicon (lexique de l'évaluation)ApopsisDeCoSoNaR Named Entities AnnotatierichtlijnenSpeeralESTER 2005 database development setGigawordIMAIL-SSIA-2009ZAPIwebGSEDC (Gold Standard for Event Detection in Croatian)Genia corpus annotation for BioNLP/NLPBA 2004Czech Morphological Analysis in PDT 2.0TectoMTCross Language Evaluation ForumIR Multilingual Resources at UniNEtrec_evalNon-projective dependency parsing using spanning tree algorithmsISST-SSTBasque WordNetEDBL : Lexical database for BasqueEULIA: Tool for Morphological AnnotationEustagger: lemmatizer/tagger for BasqueEPEC (Reference Corpus for the Processing of Basque).AbarHitzEUSEMCORlibiXMLBasque Dependency Tree Bank (BDT)EiheraSyllabeur-v2.1.jarWord Sketch Grammar for RussianSketch EngineIRASubcatWikicorpusUKBSenSemPIITHIE corpusJRC Acquis Latvian-English parallel corpusMARIEEnglish-Spanish Large Statistical Dictionary of Inflectional FormsHIFI-AVGUM-3-SpaceGUM-Space: Evaluation Data and Annotation InstructionsTools for querying an N-gram databaseTools for web-scale N-gramsMPI/DOBES Language Resource ArchiveSlovenian Lombard Speech DatabaseMuLeXFoRPukWaCWaCkypedia_enDEXONLINENEOROMRoMorphoDictLuconMapudungun-Spanish MT test suiteMapudungun-Spanish AVENUE Machine Translation Grammarlist-question-answering paragraphsPANChinese Opinion TreebankOpinion Annotation Tool (OAT)MNH SDF corpusStockholm EPR 
Corpus / Speculative clinical textOfficial Documents of the Congress of Deputies in XML formatAfrican WordNetCASIA-CASSILLarge Vocabulary Thai Continuous Speech Broadcast News corpus (LOTUS-BN)TLexs: Thai lexeme analyserGranular Time Ontology for Temporal UnderspecificationThe D-TUNA CorpusAnCora-Nom-EsAnCora-Verb-EsAnCora-EsADN-ClassifierBase de Français MédiévalNouveau Corpus d'AmsterdamNotaBen RDF Annotation ToolGiellatekno parserNameDatNordisk Språkteknologi (NST) corpusOnomastica interlanguage pronunciation lexiconText handlerChamber debatesCIAIR Back-Channel Utterance CorpusCIAIR in-car speech corpusGerman Voice Services Agender DBDutchParlGernEdiTLEX corpusLEX monolingual corpusEUR-LEX translation memoryCollection of newspaper articlesThe Prague Dependency Treebank 2.0 (PDT 2.0)Message Understanding Conference (MUC) 6Mor?eSemantic Annotation Tool (SAT)D-Coi corpusCornettoCGNNijmegen Corpus of Casual Spanish (NCCSp)GIRASIGAVirtual Language WorldCORPRESFAU IISAHThe Quranic Arabic CorpusBAStatSyntactic Annotation Guidelines for the Quranic Arabic Dependency TreebankVariKN Language Modeling toolkitUniversal Declaration of Human RightsGoogle AJAX Language APIAyDASpanish2MSLDepartment of Education Text BooksAUTONOMATA Spoken Name Corpus (ASNC)AUTONOMATA-g2p-toolkitFine-Grain Morphological Analyzer and Part-of-Speech Tagger for Arabic TextSPECIALIST dTaggerMXPOSTHealth Information Readability CorpusCurran and Clark POS TaggerFunGramKB OnomasticonMicroKnowing: Microconceptual-Knowledge SpreadingFunGramKB LexiconFunGramKB GrammaticonCOREL: Conceptual Representation LanguageFunGramKB SuiteFunGramKB OntologyFunGramKB MorphiconWTIMIT 1.0Eindhoven CorpusThe CELEX Lexical DatabaseFrequentielijst 27 Miljoen Woorden Krantencorpus 1995broad-coverage lexical resource of ArabicPubMed CentralAVLaughterCycle databaseSmart Sensor IntegrationI-EN-SAMPLEKI-04SANTINISHGCMGCKRYS-ICorpus del españolCorpus de Referencia del Español Actual (CREA)LPCC - a large parallel corpus of 
cleftsRetokenized EuroparlJava WordNet::Similarity (beta)JWPL-Java Wikipedia LibraryJava Statistical ClassesAnymalignUniversity of Maryland Parallel Corpus Project: The BibleThe Berkeley Word AlignerCC-CEDICTMGIZA++Bible Bilingual LexiconsMPROAUTOTERMAutonomata TOO native infrequent and multilingual speech corpusMainichi Newspaper ArticleBusiness News Story CorpusEvent and Sentiment Segmentation Gold StandardAustrian Phonetic Database (ADABA)ELAN annotation toolOntology for Equipping Upper and Domain Ontologies With TimeTREATTree taggerRitel-ncaXerox Incremental Parser XIPMulti-Annotated Corpus of Answers to Questions, MACAQveneto-english parallel corpusHMM-based dialogue annotationDihanaSwitchBoardUnicode CBETA ArchivesMeta-Knowledge Annotation Scheme for Bio-Events (MeKASBE)U-CompareGlossaNordic Syntactic Judgments DatabaseNordic Dialect CorpusA Python Toolkit for Universal TransliterationETS Textual Entailment Test Suite for the Evaluation of Automatic Content Scoring TechnologiesChinese Character Data (Hanzi Data)Etymological WordnetSUMOUnicode Character DatabaseGoogle TranslateISO 639-3UWNLa RepubblicaItalian FrameNetItalian Valence LexiconMultiWordNetNorKompLeksFonema TTS front endSpontal-NBulgarian National CorpusMAATSRFTaggerMB-TaggerCText Tagset for AfrikaansAfrikaans Word ListsAfrikaans Beeld CorpusTnT TaggerCallSurf ManTransTopics-140Test suite for biomedical ontology concept recognition systemsSwedish fuzzy wordnetELANAcademia Sinica Balanced Corpus of Modern Chinese (Sinica Corpus)Emotion Cause Event CorpusGF Resource Grammar LibraryDADLIPS: Lexical Isolation Point SoftwareCambridge Cookie-theft CorpusVeteran TapesAnnotated Corpus of Difficult-Antecedent Referring Expressions (DAREs)Reference Engine Development and Evaluation EnvironmentEnglish-Galician Europarl parallel corpusBoB (Bozen-Bolzano Library Bot) user dialoguesRomanian Russian WordNet-Affect RoRuWNAPOS-Tagged New Testament in Wolof (Matthew gospel)ProtégéEcoLexicon 
CorpusWordsmithToolsSMORELexicon of negation cuesThe PDTB XML converterLUNA.PLPDT 2.0 annotation toolsSupeSense TaggerSemeval-3 Task 6MyTerMSTermFactoryADL NLP Analytic ToolWest African Language Archive (WALA)pn-filterTsumugi-1.0.1A ECA-MSA LexiconSpoken Dialogue OntologyRWTH-BOSTON-50RWTH-BOSTON-104RWTH-FingerspellingRWTH-PHOENIXWomens Studies EncyclopediaAbstracts of the 39 JournalsWomen's Studies International ForumAustrian Academy CorpuscorpusEditorAAC corpusBrowserJWPLWikipedia (the Spanish version)CoCoAuthorship Paraphrase CorpusCross-lingual WSD Benchmark Data SetFipsRomanianNTCIR-1 Test CollectionSpontalNBA video pages collectionGATE Access and Interpretation of SentiWordNetDifficult Speech Corpus (DiSCo)CREAGESTAccentuatorAccentological corpus of RussianLanguage GridPAN Plagiarism Corpus PAN-PC-09Plagiarism Detection EvaluationCost-Conscious Annotation Supervised by Humans (CCASH)Speech database for unit selection synthesis of Viennese varietiesClean English ACE 2005 Event Trigger CorpusLingNetREMBRANDTAriadneItalian Legal FrameNetQASTLE (Question-Answering Systems TooL for Evaluation)CLEF - QAST 2007-2009 Evaluation PackageThe Database of Catalan AdjectivesTagalog WiktionaryICL-SearcherPeople's DailyChinese Semantic DictionaryAEGIR lexiconSPECIALIST lexiconNomBankAfazio TestSuiteNomLexXTagComLexGLISSANDOTreebank-3Penn Arabic Treebank 2TIGER Corpus 1.0The ICSI Meeting CorpusPenn Arabic TreebankTwitter corpusESTERLefffPrague Czech-English Dependency TreebankAMICA Medical Dialogue CorpusTwo-level utterance-unit annotation schemeMST parser (maximum spanning tree parser)French treebank converted into dependenciesMElt (Maximum Entropy Lexicon-enriched Tagger)Annnotation OntologyOntology-based Semantic Annotation X CorpusKALAKACollective Action Framing CorpusRainbowThe Revised Chinese DictionaryChinese Bi-Character Words' Morphological Types CorpusBlack Bean Chinese Word Segmentation SystemNTCIR CIRB040NIST Open Machine Translation (OpenMT) EvaluationThe 
Stanford ParserSwitchboard corpus annotated with dialogue actsANNIEsMailsMail Speech Act Mining RulesSummTermCESTA Evaluation PackageCINEMOJEMOArbilTMEKO, Tutoring Methodology for the Enrichment of the Kyoto OntologyKyoto OntologyKyoto TermdatabaseGeneRegKyTea - the Kyoto Text Analysis ToolkitGoldstandard of German morphological analysisStuttgart MORPhology (SMOR)Hebrew-English parallel corpusLAMPADATerm-minatorGerman Reference Corpus DeReKoPiTaggerJapanese WordNetEDRlexiconAffectiveTask SemEval2007AffectiveTask SemEval2007 Subsets with Figurative Language AnnotationsPolArtTerm UnionUMLFSignCom Projectdoxa-jv-corpusEurogene systemEurogene Multilingual Genetic OntologyDekang Lin's Similarity ThesaurusesEdit Distance Textual Entailment Suite (EDITS)Grammar error ratioGranskaIMS-corpus collectionSELFEHpassage cpcv3LiveMemories Corpus for ItalianWikiWoodsXLE English ParGram grammar + AKRKurzprotokolle des deutschen BundeskabinettsCompanion WoZ corpusMultimodal Task-Based CommunicationControlled Language in Crisis Management (CLCM)Guidelines for Evaluators for MT output errorsspoken language samples for the South African official languagesBAWERitual Descriptions CorpusWSJCAM0 British English speech databaseIraqi Arabic Did You Mean...?Information Science Institute Elicited Imitation CorpusArabic Nonnative Speaker Pronunciation Error ModelDictionary of Iraqi Arabic (Arabic-English)MAGEAD: A Morphological Analyzer and Generator for Arabic and its DialectsJAPIO patent abstractsEDR bilingual dictionaryJST scientific paper abstractsFredPowerset Gold-Parsed Wikipedia CorpusModality LexiconConceptMapperTMCEuropean name modelsPersian WordNetLwazi TTS corpora (v Sep 2009)Lwazi ASR corpora (vSep2009Lwazi primary pronunciation dictionaries v1.0BabyExpUPC_ESMAAC/DCcorte-e-costuraRomanian and English corporaRomanian and English word form lexiconsASIt - (Atlante Sintattico d'Italia, Syntactic Atlas of Italy)Gold Standard for Spanish Mass NounsVerb ThesaurusCSJ SDR Test 
collectionCSJSenso Comunetest collectionsSzeged Dependency TreebankDiccionario Clave de Uso Del Español ActualDiccionario del Real Academia EspañolaDiccionario General de la Llengua EspañolaDictionaries of Swedish names and common wordsStockholm EPR PHI CorpusAhoTransfDriFTechnical-Lay paraphrase lexiconDiabetes and cancer monolingual comparable corporaMalagaPROMETHEUS DATABASEMinefieldRODRIGOGIDOC prototypeSpecialized Datasets for Textual EntailmentThe Leeds Arabic Discourse Treebank (LADTB)Arabic Discourse Annotation Tool (ADA tool)the Penn Arabic Treebank (Part 1 v. 2.0)TTSlabCS1RITELCatalan Version of a Subset of the EuroVocSpLaSHFrench FrameNet built with Bilingual Dictionaries : FR.FrameNet.BiDicSCI-FRAN-EURADIC Dictionnaire bilingue français-anglaisSemantic Map @ CEA LISTFR.FrameNetB3DBAnnotated RTE-5 Search Data SetAugmented RTE-5 Search Data SetSanchaySentiWSInternalShort Movie Reviews for Opinion Mining and Authorship Attribution ExperimentsAnnoSemUtoolextension of Pang and Lee: polarity 2.0poldata_RatingExtractorOntoSearcherLocalNERReference Corpus of Contemporary Portuguese (CRPC)International Corpus of Portuguese (CINTIL)POLYMOTSA Dynamic HPSG Treebank of the Wall Street Journal sections of the Penn TreebankCross-lingual relatedness thesaurusMaltEvalHyderabad Dependency Treebank (HyDT)BART: Baltimore Anaphora Resolution ToolkitAutoTutorTCFHuman Annotation for Machine TranslationDK-CLARIN LSP corpusSystem for integration Corpus Management, Processing and AnalysisEuroparl v.2Connexor machines syntaxEnglish-Swedish word alignment gold standardMaximum Likelihood Linear Regression (MLLR)Improved Automatically Trainable Recognizer of Speech (iATROS)Wall Street Journal (WSJ)Kemo/DemoNon-canonical constructions in oral discourse: a crosslinguistic perspective (NOCANDO)ModeLex CorpusCREAM TBXENOVAJeuxDeMots French lexical networkMASKKOTCzech Multi-channel Speech Database of DSP LecturesWikiNER (semantically annotated corpus for Catalan)QAST 2007-2009 evaluation 
packageGeoNamesClearTKLanguage TagMooney dataset: geoquery dataMAtrixware REsearch Collection (MAREC)French WikipediaQA corpus for answer justificationDysLsCorefProItalian WikipediaEntity Mention ClassifierSWiiTHunglish1984-EH-NPGIVE-2 Corpus Collection SoftwareGIVE-2 CorpusWAPUSK20GRIDIACDiachronic Corpus of SpanishUrdu Verbs Lexicon BuilderLIMAECI Multilingual Text CorpusCoNLL 2003 Shared Task Named Entity dataSemiNERA common benchmark for text wikification toolsThe Wiki MachineDepPatternInfomatKTHNC - KTH News CorpusSaturnalia corpusRecAlignWordsPubMedBioGRIDDjangology: A Light-weight Web-based Tool for Distributed Collaborative Text AnnotationCAVaT - Corpus Analysis and Validation for TimeMLWordNet SimilarityGAWEXUMLS Specialist LexiconLink ParserSoNar IPR Acquisition ManualACE data 2007FrameNet transformerLegal CorpusTerm ExtractorArt History CorpusILC - NLP Statistical ToolsEnvironmental -Legal CorpusPAROLEPOMDELAFCorpus VALIBELLS-COLINWebSourdDicoLSFVenProPetit Larousse Illustré 1905SelectPOSArabic Treebank Part3 - Version 3.1Corpus SearchArabic Treebank Part 2 v 3.0TreeEditorArabic Treebank Part 5 V1.0XTRANSThe Bikel Statistical Parsing EngineSAMAArabic Treebank Part 1 v 4.0HunalignMorphological tagger of the Corpus of the Contemporary Lithuanian LanguageHungarian-Lithuanian parallel corpusJOS ToTaLe text analyserMorphological tagger of the Hungarian National CorpusHungarian-Slovenian parallel corpusJames PustejovskyData categories for communicative functionsHindi/Urdu TreebankSpanish FreeLing Dependency Grammar (EsTxala)Simplified Corpus Semantically Annotated with Wh-Question LabelsACE (Automatic Content Extraction) 2005 CorpusLDC2009E73 Standard Arabic Morphological Analyzer (SAMA) Version 3.1Arabic Treebank part 3 version 3.2 LDC Catalog Number: LDC2010T08De Mauro Paravia Dictionary of the Italian LanguageItWacItalian MWE databaseEngvallexPerDiPa CollectionLoonyBinIdentity Matching Adjudication Collector+: IMAC+TIGR Annotation Guidelines for Entity 
Extraction and Information Retrieval Ground Truth CreationCarafeTIGR Evaluation Methodology / GuideMALLETApproaches to Automatic Quality Estimation of Manual Translations in Crowdsourcing Parallel Corpora DevelopmentUniversal Word - Hindi DictionaryHindi WordNetPrinceton English WordNetGiza++ ToolFFV Spectrum Computation in Snack Sound ToolkitThe GENETAG corpusThe AIMed corpusThe GENIA corpusThe BANNER NER systemiMAP referring expression corpusMusicNavi2 databaseWS4LRGeolISSTermSrpRecSrpWNBehavior Language Corpus DatabaseSimultaneous Interpretation Database (SIDB)Domain product featuresSmartSUMORoget's ThesaurusWSD-IXAZT Corpusa - Zientzia eta Teknologia CorpusaMorfeusIhardetsiLDC wordlistan online Chinese-English dictionaryLemurTREC Genomics data from 2006 and 2007an online Chinese-English biomedical dictionaryan online Chinese MeSHDatabase of Narrative SchemasElixirFMNational corpus of spoken SlovenianFrench and English Contexonym DatabasesFrench and English Synonym DatabasesFrench and English Translation DatabasesAOL query logIDIAP corpus of political debatesCAREGIVEREnglish-Latvian comparable corpus extracted from WikipediaCorAlStockholm-Umeå corpusRuneberg projectEuropean Parliament Proceedings Parallel Corpus 1996-2006Unified Eventity Representation (UER)SemInVeSt (Semantically Interpreted Verb-centred Structures)Lexical Markup Framework (LMF)Spanish Resource GrammarPassageCORPS - a CORpus of tagged Political Speechesreference corpusLT4eL English Learning ObjectsSwedish Scientific Medical CorpusCALL-SLTRegulusRascalli Gossip DBYAGOMaster Metaphor ListMetaphor CorpusPenn WSJ Treebank v.3FragmentSeekerVergina speech databaseWikiNetEnglish-Gujarati DictionaryEnabling Minority Language EngineeringMorphological Analyser for South Asian LanguagesGATE Morphological Analyser (part of the GATE system)Japanese Particle CorpusIDIXRASPPlayMancer DatabaseAuthor Gender Analysis of TextWorld Wide English corpusGann - Graphical Annotation ToolC-3 (Coherence and Coreference 
Corpus)Pilot Arabic CCGbankSERASympalog SymRecErlangen Corpus of Speech Recognition TranscriptsFrench dysarthric corpusOntology Library for Intelligence DomainOntology Library for Financial DomainYou tubeErlangen Valency Pattern BankDARESProUTPragmatic Resources of Old Indo-European LanguagesDeWaCDISCOTAC 2009 Knowledge Base Population Track: corpora, knowledge base, guidelines, queries, and assessmentsTechnical domain lexiconAlignment of FrameNet and WordNetANNEXLAT BridgeThe Lemur ToolkitTRECEVALEuskaltermMorris hiztegiaCLEF Data Collections: LA Times 94, Glasgow Herald 95, topics and human relevance judgementsCzEngAZ-II corpusSAPIENT:Semantic Annotation of Papers Interface and Enrichment ToolCoreSC/ART corpusAZ-II annotation guidelinesCoreSC Annotation GuidelinesPrague Czech English Dependency TreebankYahoo! Local categoriesYahoo!'s local listings in ChicagoPAROLE LexiconC-ORAL-ROM - Integrated Reference Corpora for Spoken Romance LanguagesCorpus LE-PAROLECorpus CINTIL-PREPLEXOSCorpus CINTIL - Corpus Internacional do PortuguêsEPAC corpusOnline Transcription Tool (OTTO)MUC7TNumber Sense Disambiguation annotations of the Enron CorpusEnron CorpusProppOntostand-off annotation proposed by ISO committee on Language Resources ManagementWebLichtProppian fairy tale Markup Language (PftML)The EDR Electronic DictionaryYahoo!-TREC Question CorpusA system for automatically identifying changes in the semantic orientation of wordsDOOM, Romanian Lexical Data Bases: Inflected and Syllabic Forms DictionariesTREC evalMeSHNLGbAseGoogle Search APIYahoo Search APIFastKwicNPCEditorRen-CECpsRen_CECps 1.0Corpus for Verbal Intelligence EstimationRomanian Generative LexiconPsyCoL Maltese Lexical Corpus (PMLC)Broadcast audioSample of ANC Annotated for IdiomsEMM NewsExplorerEMM NewsBriefSrpFSDGALE Phase 5 Chinese Parallel Word Alignment and Tagging Part 1GALE_Chinese_WA_tagging_guidelinesGALE Phase 4 DevTest Chinese Word Alignment and TaggingGALE_Chinese_alignment_guidelinesGALE Phase 4 
Chinese Parallel Word Alignment and Tagging Part 1The Indiana Cooperative Remote Search Task (CReST) CorpusElicited Imitation Test Item Development ToolMulti Layered Hindi Dependency TreebankAnnotation Manual for Evaluation of Agent DialogueThe CMU pronouncing dictionaryICT corpus for speech recognition evaluationWSJ acoustic and language modelsCambridge HTK, HDecodeCMU SLM toolkitSRI Language Modeling ToolkitCMU SphinxNICT Kyoto tour guide dialogue corpusCorpus of Editorials from Newspapers published in Nepal and worldwide and annotation of arguments and opinionsAnnotation scheme or semantic tagset,Enquête Socio-Linguistique à Orléans (ESLO)IITKGP Text Emotion CorpusPerugia Corpus (PEC)Dictionary of Italian CollocationsAn annotation scheme of modality in a broad senseA Japanese corpus annotated with labels in a scheme of extended modalityEnglish - Oromo Parallel CorpusCycloneUKWACRelProp Adjective CorpusMultiUNXipFrAG (French Annotation Grammar)Multilingual corpus for Opinion MiningQuaero QA corpusXerox Incremental Parser (XIP)Polish wordnet, plWordNet (Słowosieć)SuperMatrixClarin Web Services at Wroclaw University of TechnologyMulti-APIpepr Framework - Process Engine for Pattern RecognitionEnabling Minority Language Engineering (EMILLE)Q-WordNetJapanese to English MT rules for Multiword Functional ExpressionsJapanese hierarchical Ontology for Multiword Functional Expressions TsutsujiMeaning-Text Theory (MTT)Corpus of Meaning-Text Structures (CoMTeS)context sensitive variant dictionarySextantPerLexSpecies2000Evaluation des Systèmes de Transcription enrichie d'émissions Radiophoniques - ESTERBREFThe Nijmegen Corpus of Casual French - NCCFrBulgarian FrameNetCMU Pronunciation DictionarySUPPLEMuNPExMiniParPAXEnglish-Latvian localization TMEMEA - European Medicines Agency documentsLatvian news corpusLeipzig Corpora CollectionThe DGT Multilingual Translation Memory of the Acquis Communautaire: DGT-TMConnexor parserSORA corpusPAROLE sottinsiemeASC-ITPAROLE-SIMPLE-CLIPSCPA-It 
Italian Pattern DictionaryLT World OntologyHeart of GoldCorpus of Arabic SpeechAsian WordNetMultilingual voice creation toolkitLT4eL terminological lexicons in IT domainLT4eL ontology in IT domainLT4eL multilingual corpora in IT domainTurin University TreebankEuroparl Parallel CorpusAlpino TreebankAriadne Corpus Management SystemMaNaLaGeoQueries250RestQueries250extSVM-lighT-TK 1.2RestQueries250GeoQueries250extGerman Idiomatic PNV-Triples in Context (GIPIC)Duden 11 / Redewendungen: Wörterbuch der deutschen IdiomatikFrankfurter Allgemeine Zeitung, FAZArabic Tree BankRelExStanford English grammatical relation extraction utilityStanford part-of-speech taggerCharniak and Charniak Johnson ParserTripAdvisor Data SetWordNet-AffectWordNet DomainsSlashdot Comments CorpusHamshahriBijankhanYamCha: Yet Another Multipurpose CHunk AnnotatorSemantic Case Frames of EnglishA part-of-speech tagger for EnglishHunPosTokenizerTwinityGossip ontology and celebrity DBfurniture ontologySPAS (Structure and Point Annotation of Stories)Dutch corpus for abbreviation detection and resolutionCorpusSearch2Syntax-oriented corpus of Portuguese Dialects - CORDIAL-SINSimple Parser for HindiA Multi-layered/Multi-representational Treebank for HindiCornerstonePropbank frameset filesBrandeis Annotation Tool (BAT)Spanish SpeeCon databaseOpenLogosReEscreveReWriterPort4NooJEng4NooJNooJ20 Newsgroups Data SetANC Manually Annotated Sub-corpus (MASC)A text corpus annotated with usage informationLearning Based JavaProPOSEC: a Prosody and POS annotated Spoken English CorpusDependency based Transfer rulesThe Ellogon language engineering platformAnnotation tool for bilingual aligned corporaeg-GRIDS+Annotation Schema for Collocation Errors in Learner CorporaGALE 5W Distillation EvaluationCROVALLEXItalian BARTShabdanjaliDaniel PipesIIIT-TidesSRILM - The SRI Language Modeling ToolkitAddicterAgriculture domain parallel corpusEmilleCICC Indonesian Basic DictionaryGMA (Geometric Mapping and Alignment)Belgisch Staatsblad 
corpusVerb Lexicon for Second Language LearnersC-Comparator v0.23MIRE based shallow parserMorphosyntactically annotated Greek corpusGreek Dependency TreebankILSP Dependency ParserILSP LemmatiserILSP Text Simplification toolILSP Term ExtractorILSP ChunkerTimeEL corpusILSP FBT TaggerTimeELperson name ontologyNew Stuttgart Radio News CorpusTREC09_ChatArabic VerbnetMinimaPraat version 5.1TIMITHidden Markov Model ToolkitPrague LabellerCelex EPW (English pronunciations)Kachna L1/L2 Picture Replication CorpusSentiWordNet 3.0KORAIS speech databaseGerManCParallel text-image french news corpusFrameNet to WordNet mappingAdaptation of the David Chiang's (2000) STIG parser (eg. Hybrid)POETICON CORPUSWorld Wide Arabic corpusBank of Russian Constructions and ValenciesItalian TimeBankPAROLE-SIMPLE-CLIPS PISAAnCoraARTiFactOZayaPersian Linguistic Database (PLDB)FarsNetSTeP1PeykarehTranslated Wikipedia InfoboxesFairy tale corpusThe ABLE biodiversity corpusThe German-Russian Parallel Corpus of Sigmund Freud’s „The Interpretation of Dreams“MindNetGalician speech corpusGalician lexiconTAUS Data Association TM PoolLinguisticaC99Usability Guidelines for Annotation ToolsQuranyDiscourse Graph BankArabic WikipediaMINELexArabic WordNetLDC2007T23 GALE Phase 1 Chinese Broadcast News Parallel TextLDC transaltion guidelineseXtended WordFrameNetCorpora of Corpus FactoryVERTLASSYGENIA CorpusGENIA OntologyALeSKo: Annotiertes LernerSprachenKorpus (annotated learner language corpus)SALSA CorpusMaJoRussian Positional TagsetBlogBusterEllogonSTeP-1-TokenizerSTeP-1- morpho analyszerSTeP-1-POS TaggerCzech Web CorpusCzech National CorpusError-Annotated German Learner Corpus (EAGLE)Lingenio Corpus ToolA2STSense Folder CorpusChinese menu corpusJAPE EditorArabic TreebankQuaero NE patent corpusTSR CorpusChungdahm English Learner CorpusMIT FlightBrowser CorpusMIT Address CorpusWami ToolkitConstrative Lexical Evaluation of Machine TranslationN-codePOS TaggerFinite State TokenizerGold Standard POS Tagged 
CorpusDIINAR.1Delicious datasetLT4el ontology on computingCorpus of naturally-occurring corrections and paraphrases from Wikipedia's revision historyGRISP (General Research Insight in Scientific and technical Publications)Bilingual dictionaryCLIC Corpus della Lingua Italiana ContemporaneaRACAI web serviceCTLJ-ServerEDRCroatia Weekly 100 kw Corpus (CW100)CroTag Morphosyntactic TaggerScheherazadeLatent Semantic Analysis WebsiteMetricsMATR08 development dataJava WordNet::SimilarityREBECAWoordenboek van de Drentse dialectenPKUtreebankNICT_JC_SPNIKKEI_BPJurisdicCompanyMCommunicatorSpatial Annotation SchemeBase ConceptsOntoWordNetCore WordNetDOLCE Foundational ontologyDOLCEIntelliTextRoZPPr.A.Ti.D.EuroTermBankLemmaldIceTaggerIceParserLDC Data Exploration ToolkitLDC MADCAT Management Web AppLDC Word Alignment ToolThe Web Col FrameworkLDC Machine Reading Annotation ToolBase de datos de verbos, alternancias de diátesis y esquemas sintácticos del español (ADESSE)MPC: A Multi-Party Chat Corpus for Modeling Social Phenomena in DiscourseMachine Reading P1 NFL Scoring Training Data (LDC2009E112)Machine Reading P1 IC Training Data (LDC2010E07)TAC 2009 KBP Evaluation Entity Linking ListTAC 2009 KBP Evaluation Source DataEster2FrameNet-Wordnet DetourVerbaLexCG treebankAutoTagTCGCorpus of Czech sentences with manually annotated clause boundariesThe Quran and Tafsir CorpusbibleSyntactic lexicon of Arabic verbsAmarakoshaCollection of Croatian Financial TextsHansard, HLT evaluation dataSystranGTMCross-Corpus ModelOALNOUN COMPOUNDS IN CZECH, ENGLISH AND ZULUkddo1kdd09cma1The PIT Corpus of German Multi-Party DialoguesFipsCoTo be definedTermExtractorWebcorpEcoLexiconOrthographic Agreement's Knowledge BaseHuman Language Technology Virtual OrganizationAVE evaluation collectionsSpecies 2000wordnets in various languagesJubileeACE 2003MUC6 Annotationspair-hmm-translitGMTK-DBN-transliteration-model-scriptsJRC Quotes Collection for Sentiment AnalysisDicionário AbertoNuance ASR/NLU GrammarDiscourse 
ontologyGuidelines for Caption Annotation: Notes for annotators on how to identify and ground toponym expressions in captions.KnowtatorTRIPOD Corpus of Annotated Image CaptionsANNALIST: ANNotation ALIgnment and Scoring ToolToponym Ontology Geocoding Service (TOGS)Stuttgart-Tübingen Tagset of GermanTübingen Treebank of Written German (TüBa-D/Z)RPM2 Summarization and Sentence Compression CorporaGermanPolarityCluesOnline Database of Interlinear text (ODIN)MASCWorldmapperEnt2WikiL3MorphoOlympus NLU Evaluation FrameworkOntoTag's abstract architectureConnexor's FDG ParserLACELL's POS taggerBitext's DataLexicaEAGLES Recommendations for the morphosyntactic and the syntactic annotation of corporaMultilingual Question-Answer Pair CorpusMetaphorDecoOwlExporterMutation Impact TaggerMutation Miner OntologyGEMSDIPSMarathi WordnetPrinceton WordnetJWPL TimeMachine6-monthly snapshots of WikipediaUTD-MotionEventLanguage Technology Resource CenterPattern Dictionary of English Verbs (PDEV)PUMAWQueryPolNet-Polish WordnetCorpus brut amazigheCorTradA Gesture Analysis and Modeling Tool for ANVIL (GAnTool)ACLP American English Messaging Lexiconthe FreeTalk Conversation CorpusThe ESP_C CorpusTest set: TREC 10 questionsAnnotation guidelines for CoNLLTraining set 5(5500 labeled questions)NOMCO multimodal Nordic corpusCorpus of Spontaneous JapaneseBalanced Corpus of Contemporary Written JapaneseArabOrth lexiconCatib version of the Penn Arabic Treebank part 3 v3.1HornMorphoLDC Arabic-English parallel corpusAnCoraPipeArabic-English Parallel Word Aligned Treebank CorpusDan Bikel Multilingual Statistical Parsing EngineLDC2005T20Acquis , UNs , Meedan , LDC2004T17ALGASDList of questions and their answersArabic-English parallel corporaMoses Phrase-based Statistical Machine Translation systemArabic PropbankDialectal ArabicMILA Hebrew CorpusDialectal Arabic ResourcesKawakibCWBMila morphological analyzerResource grammar in GFARALEXThe Essex Arabic Summaries Corpus (EASC)DIINARMaltParserNext Generation 
Localisation Process MapCorpus IDVIULA Text HandlerBrills TaggerFpgrowth Algorithm ImplementationDANTEDanNet , Arab WordNetVocon3200 BasqueZT CorpusaAnHitzDlgMatxinElezkariAhoTTSTANL (Text Analytics for Natural Language)UIMAWordFreakOpenNLP ToolsAWAdbGlozzannie-rdfannie-alphachunking-synaf-enmaf-enTEI (Text Encoding Initiative)IFrameILCI corporaOpen-Content Text CorpusFreeDictOntologies of Linguistic Annotation (OLiA ontologies)RevLemISOcat Data Category Registry (DCR)LIR/Lexical Information RepositoryMLIF/MultiLingual Information FrameworkISOCatSALTGrAFFSR - Feature Structure Representation (ISO 24610-1)ISOcat Data Category RegistryLEXUS, ToolboxXLIFFISO DIS 24612 Language resource management - Linguistic annotation framework (LAF) and other ISO standardsLwazi TTS corpusLwazi ASR corpusIgbo CorpusLuo Part-of-Speech TaggerParallel CorpusRules for annotating VPs of Northern SothoLanguage Identification for eleven South African LanguagesSwahili-English parallel corpusCTexT Alignment InterfaceParallel Text Corpora for 3 SA language pairsANERcorpSYNERGYHelsinki Corpus of SwahiliKamusi ProjectUIUC Learning Based Java (LBJ) Named Entity TaggerDagbani CorpusGĩkũyũ 
CorpusLwazi ASR CorporaLDOS-PerAff-1Corpus of negotiation between users and virtual charactersOPENCV Processing and Java LibraryAnvilOpenCVDYNEMO: A corpus of dynamic and spontaneous emotional facial expressionsD64 Multimodal Conversational CorpusNordic multimodal Corpus (NOMCO)The USC CreativeIT DatabaseSpeech & Prosody SegmentationAudio-Visual Corpus (3D Faces and Speech)3D Face TrackerDigital Replay System (DRS)TimelineCIDCALLAS gesture expressivity corpusInSight InteractionUTEP-ICT Cross-Cultural Multiparty Multimodal Dialog CorpusDOMESemaine DatabaseSAL DatabaseEmomultidisciplinary medical meetingsFilMED - Filipino Multimodal Emotion DatabaseHuComTech Multimodal (Audio-visual) DatabaseSaGATKKVICLODanPASS dialoguesHead Pose and Eye Gaze dataset HPEGMozilla Semantic DesktopArguMeetDUC 2007 DatasetMobySNOMED CTConcord-EDOthersRPAH-ICUDisease and Adverse Effect CorpusPubMed Stopword ListChinese Medical Subject HeadingsLemur ToolkitTREC Genomics 2006 and 2007谷歌金山词霸2.0 (Google and Kingsoft Dictionary 2.0)GENIA EventAnonymised patient recordsBody part ontologyBulgarian DictionaryColorado Richly Annotated Full TextEstonian WordNetBalkaNetSPSSSALDOThe Specialist LexiconelexikoGensimCzech Digital Mathematics Library, DML-CZMARF and its ApplicationswoofDKProRussian WordNet, Russian WordNet GridJavadoc doclet corpus generation toolBehemothcTAKESMachine Learning for Language Toolkit (MALLET)UIMASTJULIE Labs UIMA Collection Reader for WIKIPEDIAAutomatic Annotator softwareService-Finder datauimaSolrCASCorpusPediaCRPCForeign Language Examination Corpus of the University of Warsaw (FLEC UW)Research Article CorpusukWaC + itWaCCross Language Translation and RetrievalChinese and English Parallel Corpus Extracted from Comparable PatentsWortschatz Universität Leipzig - Corpora and Language StatisticsComparable Corpora for EU LanguagesACCURAT Initial Comparable CorporaIrish Sign LanguageARPToolkitSpanish-LSE corpusSign Language Pose Estimation 3D Parse TreesRWTH Phoenix Weather 
ForecastAVATecH databaseAVATecH ApplicationSigns of Ireland CorpusItalian Sign Language CorpusSignCom corpusOnno CrasbornAmerican Sign Language Lexicon Video DatasetDGS-CorpusCopyCat CorpusData Collection PlatformSign verification systemLabel ToolNational Center for Sign Language and Gesture Rescoures (NCSLGR) corpus, Boston UniveristyAuslan CorpusCUNY ASL Motion-Capture CorpusRussian Sign Language Explanatory DictionaryThe basic dictionary of FinSL example text corpus (Suvi)MoodleILexJASigningSILADON_CZSigning footage recorded from TV with simultaneously broadcasted subtitlesOCELLESRWTH-BOSTON-400CatCGCatalan Sign Language Corpus on the weather report domainNIST score (script: mteval-v13a.pl)am_toolsCorpus NGTJulie Hochgesangdasher for sign writingDicta-Sign : API for pluginsGSL Classifier CorpusSIGNUM DatabaseAmerican Sign Language SynthesizerAugmented Reality Authoring Tool with Online Information Integration PlatformQuestion Analyzer for RomanianThe Eurogene corpusThe Eurogene ontology of human geneticsMyanmar Word TokenizerMyanmar Name Entity RecogniserSense Tagged CorpusEnglish-Myanmar Parallel Text alignerCollective Named EntitiesName Matching Evaluation FrameworkSxPipe/NP2evalSxPipe/NPENCOREBART Anaphora Resolution ToolkitACE-2 Version 1.0LBJ Coreferene PackageMIX1Asymmetric Threat Response and Analysis Program (ATRAP)BoB dialogue logsYahoo! 
Answers Comprehensive Questions and Answers corpus [version 1.0]Question Answer Sentence PairExcite query logWikipedia article namesAnnotated Webclopedia question collectionTREC QA questionsmulti-XEN1MSzeged CorpusHungarian Webcorpusmorphdb.hunamed entity taggersRipperFodina Patent TextsChoose the Right Word Formality PairsHunmorphFormalism for Morphological and Phonological GrammarMPI Language Resource ArchiveStephen Jay Gould ''Leonardo's Mountain of Clams and the diet of Worms.'BAMDESSpoken Turkish CorpusLeffeeuromobil 2LexkitGeorgian Ontological Semantics LexiconNetgraphIndex Thomisticus Valency LexiconGuidelines for the Syntactic Annotation of Latin TreebanksAnnotations at analytical level: Instructions for annotatorsHFST finite state speller evaluationWikipedia dump and scriptsOpen source morphology for Finnish (omorfi)Nganasan LexiconTypeCraftSoraLexHas not been named yetTnTCombiTaggerfnTBLBidirectional taggerCorpusTaggerThe Icelandic Frequency DictionaryIceNLPUrdu data annotated for EmotionsmwttoolkitGreek MWE dictionaryA Georgian-Russian-English-German Valency Lexicon for Natural Language ProcessingJRC-AcquisSouth-East European TimesRomanian syllables data baseLexParJRC-ACQFrRoFRROAUFDGT Translation MemoryHungarian WordNetMagyar értelmező 
Kéziszótár / Hungarian Explanatory DictionaryFrequency Dictionary of Verb Phrase ConstructionsLegalPrivacyOntoPopulateAnnotated Intellectual Property Claims in Complaint DocumentsSwiss statutes and regulations (semantically analyzed)Controlled Legal German (CLG)German court decision corpusLegal Case Factors ExtractionFAU AiboopenEARSmartKomSALVera-am-MittagMMI Facial Expression DatabaseSpeech In Minimal Invasive Surgery (SIMIS)EmoVoxNao-ChildrenIDVSEMAINE corpus headnods and shakesRECC-Rovereto Emotion and Cooperation CorpusVocal Expressions of Nineteen Emotions across Cultures (VENEC)Mind Reading(no-name)DoorsWordNet-Affect-OCCDigg datasetBBC News forums data setEmotional Narratives CorpusMaltOptimizerPostech Learner Corpus (POLC)a Collection of Translation Error-Annotated Corpora (Terra)A Large-Scale Unified Lexical-Semantic Resource (UBY)A Library for Large Linear Classification (LIBLINEAR 1.51)A Multi-layered Reference Corpus for German Sentiment Analysis (MLSA)A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language (MMPC)A new French Meta Grammar (frmg)An Open Toolkit for Automatic Machine Translation (Meta-)Evaluation (Asiya)an XML Based System For Corpora Development (CLaRK)Annotation Tool for Concepts and Relations (Recon) ???Arabic GigawordArabic Treebank Part 3 v 3.2Arabic-English Parallel Aligned Treebanksasa-sentiment-analysisAtlante Sintattico d'Italia, Syntactic Atlas of Italy (ASIt)Austrian Academy Corpus (AAC)BAS Bavarian Archive for Speech Signals Pronunciation Lexicon PHONOLEXBasic dictionary of FinSL example text corpus (Suvi)BRIGHAM YOUNG UNIVERSITY British National Corpus (BYU-BNC)CALBC (Collaborative Annotation of a Large Biomedical Corpus) corporaChinese Spell Checking DatasetComponent MetaData Infrastructure (CMDI) Information PageCoreference Resolution Engine (Reconcile)Corpora from the web (COW)Corpus DEFT (Défi Fouille de Textes)Corpus Internacional do Português (CINTIL) PropbankCroatian Inflectional Lexicon (MOLEX) 
Cross Language Entity Linking in 21 Languages (XLEL-21)Dependency Part of BulTreeBank (BulTreeBank-DP)Dictionnaire de valence des verbes français (Dicovalence 2)Dictionnaire fondamental de l'informatique et de l'Internet (DiCoInfo)Drugs@FDA Data FilesEmotional Speech Database for Basque (Ahoemo3)English Child Language Data Exchange System (CHILDES) Verb Construction DatabaseEuconstFlexible Error Annotation Tool (feat)Format for Linguistic Annotation (FoLia)Free Text File Merging Tool (TXTcollector)genchal-repositoryGeneral Ontology for Linguistic Description (GOLD)German Web Corpus (DeWaC)Glossary of International Relations (GLOSSIR)Helsinki Finite-State Technology (HFST) toolsInternational Workshop on Spoken Language Translation (IWSLT) 2011 parallel TED CorpusJava library for detecting Multi-Word Expressions (jMWE)Joint Research Centre (JRC) Eurovoc Indexer JEXKTH eXtract Corpus (kthxc)Lancaster-Oslo/Bergen (LOB) Corpuslanguage generation evaluation toolkit (lg-eval)Large Bilingual Speech Database for Synthesis (Ahosyn)Learner Corpus of HungarianLexicon Enhancement via the GOLD Ontology (LEGO)LGTaggerMAZEA-WebMITRE Dialogue Kit (MIDIKI)Modeling linguistic corpora in OWL/DL (POWLA)Modern Arabic Representative Corpus 2000 (MARC-2000)MorphoAdornername recognizerMultilingual Turin University Treebank (ParTUT)Multilingual UN Parallel Text 2000-2009 (MultiUN)Multi-perspective question answering (MPQA) sentiment lexiconNear-Identity Relations for Coreference (NIdent) CA Official Europarl test set from WMT 2008pantera-taggerParallel Corpora Collector (PaCo2)Parameterized & Annotated CMU Let's Go Database (LEGO)Persian Treebank (PerTreeBank)Perspicuous and Adjustable Links Annotator (PALinkA)PORT-MEDIA DomainRecognising Textual Entailment (RTE) 2 Test SetReference Corpus of Contemporary Portuguese (CRPC) Modality SampleSerbian morphological electronic dictionary (SrpRec)Serbian Wordnet (SrpWN)SignWriting improved fast transcriber (SWift)Similarity Metric Library 
(SimMetrics)SMI Remote Eye Tracking DeviceSpeech Database in Basque for Synthesis and Voice Conversion (AhoSpeakers)SPeech Phonetization Alignment and Syllabification (SPPAS)STEVIN Nederlandstalig Referentiecorpus (SoNaR)Swedish Kelly listSyntactic lexicon for French (Lefff)The AQUAINT Corpus of English News Textthe BiLingual Annotator/Annotation/Analysis Support Tool (Blast)The Concisus Corpus of Event SummariesThe DGT Multilingual Translation Memory of the Acquis Communautaire (DGT-TM)The Freiburg - LOB Corpus of British English (FLOB)the open parallel corpus (OPUS) UK PubMed Central (UKPMC)Verb Pattern Sample, 30 English verbs (VPS-30-En)W2C - Web To CorpusWebCrawlerWeb-Harvested Corpus Annotated with GermaNet Senses (WebCAGe)Word clAss taGGER (WAGGER)WordNet Libre du Français (WOLF)Sentiment QuizEmpoli e dintorniProposition databaseAssociation NormsYamabukiThe Database of Icelandic Inflection [Beygingarlýsing íslensks nútímamáls]InputlogPimorfoCasual English Generation Phoneme DatabaseInfomap NLP SoftwareHTTrack Website CopierRNgram Statistics Package (NSP)HTMLAsText v1.11GSplit 3ABBYY FineReader 10TermoStat Web 3.0metu sabancı 
Turkish Dependency Treebank - additional annotationLeXimirWordNet AtlasSUTimeERDOEdit PluseXtended WordNet DomainsTurkish Word SketchesDutchSemCorCallistoFrench corpus of event nominalsOldpress CorpusLFG Grammar of PolishSemantic TypesDependency-Parsed FrameNet CorpusA Reference Dependency Bank for Analyzing Complex PredicatesBoundary-Annotated Qur'anAncora-3LB-POSLAST MINUTEYahoo Chiebukuro Corpus Humor IndexAnnotated UGC corpus for normalizationWeSearch Data Collection (WDC)SMALLWorldsMandarin Chinese GrammarCroatian Dependency TreebankInterCorp - a multilingual parallel corpusDSimAWATIFTARSQI ToolkitLGLex 3.3Aesops fables and Andrew Lang fairy tales collectionColloquial Egyptian ArabicFAUST Feedback AnnotationOral History Annotation ToolAnnotations for progressive aspect sentences in the spoken section of the BNCA Cross-Lingual Dictionary for English Wikipedia ConceptsSocial Constructs - Pursuit of powerCapek: an annotation editor for schoolchildrenSubjectivity Lexicon for Dutch AdjectivesGerman Food Relation DatabaseChinese Whispers Paraphrase Corpus (CWPC)FAUST quality assessmentsTIGER Treebank (release August 2007)German Parliament SessionsSemSimANALECPrague Czech English Dependency Treebank 2.0PathoJenFTW multi-speaker synchronous acoustic and 3D facial marker data in Austrian GermanModeling Textual Organization (MTO) CorpusCatalan WebcorpusNTCIR-9 SpokenDoc test collectionVerb Lexicon and Event DurationsJAKOB-LexikonEmail corpus annotated with social power relationsDramaBank corpus and Scheherazade annotation toolA Sentence Database for Chinese Reference GrammarPrimeCoefBlademistress corpusDirectional corpora in EuroparlA Universal Part-of-Speech TagsetTAC 2009 KBP Gold Standard Entity Linking Entity Type listEnClueWebArabic Treebank (ATB) Part 3 v 3.2Casual English Conversion System DatabaseTrilingual Parallel (Arabic-Spanish-English)Corpus-SampleISO 24617-2 Semantic annotation framework, Part 2: Dialogue actsPropbank-BrhandAlignedCorporaJPENTWSI 2.0: Turk 
Bootstrap Word Sense InventoryDiachronic German Corpus TBa-D/DCArabic Subcategorization Frames in the Arabic TreebankYADACTED-LIUMCLASSYN text type-specific corporaCorpus of Pronominal Anaphora of the QuranPoliMorfKPWrRussian Automotive CorpusThe Herme Database of Spontaneous Multimodal Human-Robot DialoguesdeepKnowNetCATGerman political news corpusMulti-perspective question answering corpus (MPQA)Birkbeck spelling error corpusSpanish C-ORAL-ROM-ELEQurSimSFU Review corpusPortal Língua PortuguesaRWTH-PHOENIX-Weather CorpusRIDIRE-CPICLIMBThe Hindi PropBankTRIS CorpusGrammatical Framework (GF) Resource Grammar LibraryMultilingual Central Repository version 3.0Quranic Arabic CorpusDECODAOntology of Italian LinguisticsSquoia TreebankLinguistic Linked Open Data cloudNgramQueryAhoDiarizeHal Scientific Paper Corporaapertium-kircSRIDENTIC CorpusWordNet mapping to Kyoto OntologyAnnotation Discursive (ANNODIS) CorpusFinnish WordNet (FinnWordNet)Connective annotation over EN/FR EuroparlLatvian resource grammar in Grammatical FrameworkIT-PANACEA SCF test suiteCorpus of Sentences for User Interaction in Pronunciation Learning SystemsMicrosoft Research Lab India's Hindi-English Transliterated Song Lyric DataThe REX corporaCARDS-FLYMo Piu data baseMultitradPolish Sejm CorpusIntegrated Reference Corpora for Spoken Romance Languages (C-ORAL-ROM)Chronolines CorpusWebAnnotatorDicoInfoGene renaming corpusapertium-es-anROMBACNoor Book CorpusGold Standard for English human nounsJWNLSimpleULexThe Twins Corpus of Museum Visitor Questionsquestioncorpus.ptInternational Corpus of Learner EnglishSwedish Framenet (SweFN)Large Scale Syntactic Annotation of written Dutch (LASSY)Quaero Terminology Extraction Evaluation Patents CorpusTrésor de la Langue Française informatisé (TLFi)FixISS databaseCorpus of Computer-mediated Communication in Hindi (CO3H)Le Petit Prince in UNLPORTMEDIA LangBulgarian X-Language Parallel CorpusData repository of spontaneous spoken CzechCroatian Valency Lexicon of Verbs 
(CROVALLEX)goo300k corpus of historical SloveneETAPEBulgarian National Reference CorpusRomanian TimeBank corpusMIMEstonian reference CorpusBrandPittMULTIPHONIAMASC word sense corpusTAC 2010 KBP Evaluation Source DataExample Database of Japanese Multiword Functional ExpressionsUniDic-2.1.0RELcat a Relation RegistryAttribution DatabaseEllogon language engineering platformFirst Certificate in English (FCE) exams of Cambridge Learner Corpus (CLC)Dot object predication gold standardSimcoach speech synthesis evaluation corpusAustralian National CorpusInforexLexItDutch Parallel Corpus (DPC) (subpart)Irony detectionText::Perfide::BookSync (Perl module)TIMENTajik-Farsi Persian Transliteration SystemStockholm MULtilingual TReebank (SMULTRON)Suffix Tree Language ModelGisterTreebank.infoa Beautiful Anaphora Resolution Toolkit (BART)EMO_EventsNTCIR-7 Patent Mining dataXFST Murrinh-Patha morphological analyzerRembrandt frameworkProsomarkerAnnotation of instructional textsNeoTagPersian Part of Speech TaggerTurkish Paraphrase Corpustexrex web corpus toolsuima-commonCorpus of Indefinite UsesSeCo-600AledaStockholm EPR Clinical Entity CorpusGerman Logical Metonymy DatabaseAutomotive Repair OrdersWES baseThe LarKC life sciences datasetGlottolog/LangdocRomanian WordnetAnnotated Film Dialogue CorpusEPIC Twitter NLP Development DatasetChiba three-party conversation corpusPiTuLetsMT!HunOrQuaero Football CorpusALLEGRAmate-toolsSciTexTimeBankPTEst Rpublican Parsed CorporaLyrics&NotesThe Nordic Dialect CorpusSzegedParalellFXDEGELS1MLTaggerApertium Spanish Monolingual Dictionary from Spanish-Catalan language pairVirtual Language ObservatoryAnItaPolarity lexicons in SpanishConanDoyle-neg corpusSpeech data corpus for verbal intelligence estimationEstonian Multiparty dialoguesSimplextBaltic Language Named Entity Recognition (NER) corpusNomcoI3Media multilingual emotional speech corpusPOLEXPESHeidelTimeAjkaAn Annotated, Multilingual Parallel Corpus for Hybrid Machine TranslationLDC User 
InterfaceCzech Web Corpus 2011UniDic for Early Middle Japanesetexto4sciencePolish Multimodal CorpusDeCourANVIL toolPETSemeval-2010 Japanese WSD task datasetPRONTO Firefighter CorpusQuestion Pairs from WikiAnswersmaui-indexerPEXACCSentiStrengthNKI-CCRT CorpusEnriched GENIA Event Corpusiula2standoffMinho Quotation ResourceCultural Heritage item - Wikipedia matchGerNEDB3-toolCorpus on Debate and DeliberationJoint Research Centre JRC-Acquis German-EnglishCzech-English Parallel Corpus (CzEng) 1.0Divergence Measure Tool (DMT)Customized Europarl corpusCLTC corpusMetadata editorLarge Dataset of French-English SMT Output CorrectionsC-ORAL-BRASIL IWiktionary lexical networkHOO Evaluation FrameworkCroatian collocations gold setPolish WordNet (plWordNet)Elhuyar Basque-Chinese predictionaryWOLFT2T3Phase One data release for Blizzard 2012Arabic Wordlist for SpellcheckingTamil Dependency Treebank (TamilTB)Netlog Corpus and Chatty SubcorpusBritish National Corpus (BNC), spokenKIT Lecture Corpus for Speech TranslationAn Open Source Persian Computational GrammarISABASE - 2Spatial Containment Relations Between EventsYet Another Term Extractor (YATE)CALBC CorporaNational Center for Sign Language and Gesture Resources (NCSLGR) corpus, Boston UniversityThe Switchboard CorpusRelaxCorDamascene Colloquial Arabic SpeechAncient Greek Dependency TreebankEuroparl v.6Wikinews Multidocument Summarization Data Collection ToolThe Stanford Parser: A statistical parserTwitter Emotion CorpuseXtensible MetaGrammarTIGERexpSVMlightA Collection of Russian Corpora (University of Leeds)SrpRec - Serbian morphological electronic dictionary (SrpRec)Aligned ConceptNetsStatistical English-Myanmar Machine Translation SystemTsinghua Chinese Treebank (TCT)English-Croatian Parallel Corpus (EngCro)Corpus QuaeroPalula CorpusDragon Naturally SpeakingPhonologie du Français contemporainTemporal Entailment Rules for RDFS and OWL-Horst dialectAnnotated Corpus of Automotive Engineeringnews articles on epidemicEvent 
levelsJapanese Word Dependency CorpusOntoverbetymosJoint Research Centre (JRC) Quotes Collection for Sentiment AnalysisAutomatic annotated training data for temporal slot fillingDaemonetteSDEWACGreek-Chinese Interlinear of the New TestamentGreek POS TaggerNorth American News Text Corpus (LDC95T21)Icelandic Parsed Historical Corpus (IcePaHC)ELRA-W0051Language Similarity TableEmotiWordEnglish-Russian Wiki-dictionaryTUNA-Lex corpusWeb Service Architecture for AlignersAVATecH BAtch eXecutor ABAXLibrary of Natural-Language Representations of Formal RelationsBilingual LexiconJezikovne tehnologijeSPIDIXSweVocOSCARRobertino-game corpusCORIS/CODISLignocellulose CorpusNLPipethe Callhome Mandarin Chinese CorpusTalkBankThe Kalashnikov 2K dependency bankLitRecAssociative Concept Dictionary (ACD) for VerbsEpinions Annotated Reviews DatasetBulgarian Sense Annotated CorpusDutch de/het noun classification script for TiMBL/Frog outputStanford ParserJapanese Corpus of Diverse Document Leads with Anaphoric AnnotationCAMELEON comparable corpusUighur to Chinese Dictionary DatabasehunspellEventMapping.rdfcorpus_shellThe KPG English CorpusYet Another Multipurpose CHunk Annotator (YamCha)Croatian Sentiment LexiconLeuven Arabic DatabasePashto monolingual corpusOld Hungarian CorpusIMAGACT Annotation InfrastructureKinOathMathHindi Discourse Relation BankCroatian Morphological LexiconFips Web ServiceRadziszewski Acedanski Tagging Evaluation MethodTagged Quranic CorporaSETIMESVietnameseNERCroatian CG Morphological Disambiguation RulesSequence-based alignerBiomedical why-question answering corpusThe Portuguese Regional Accent DatabankHausa Internet CorpusMultiword Expressions Toolkit (mwetoolkit)Spoken Corpus of Standard European PortugueseLM/PRUDouble-Edged WordNet (DEWN)ConceptNet 5Multiple-Chinese Translation Part 4 (MTC4) datasetFrench preprocessing OpenNLP ModelsEnron questions and answersAmerican National Corpus (ANC) Manually Annotated Sub-Corpus (MASC)LEXCONNTurkish Dependency 
ParserCorpus of evolution of designation of events in FrenchAnnotation guidelines for event nominal taggingŚwigraNIdent-ENDELAAesops fables timeline annotationsGerman (P)LTAG treebankOpinionFinder Subjectivity LexiconTiGer2DepCroatian WebcorpusBonsai Dependency Parser v3.2MoCap ToolboxTAC 2010 KBP Evaluation Entity Linking Gold Standard V1.0Ontonotes 4.0 newswire sectionHTML and PDF text and metadata extractorsMorfeusz SGJPCELCT CorpusModal sense corpusC-ORAL-CHINAKyoto University Case FramesSvenskt associationslexikon (SALDO)OpenSubtitlesSpanish Learner Language Oral Corpora (SPLLOC)CeramicaSpanish Aragonese DBpedia Abstract CorpusGold Standard for English abstract nounsLang-8 Learner CorpusGrETEL (Greedy Extraction of Trees for Empirical Linguistics)Quaero Terminology Extraction Evaluation Abstracts CorpusDependency Shift Reduce parser (DeSR)Extended DOLCE OntologySYNC3 Collaborative Annotation ToolIMS Open Corpus WorkbenchChinese-English Parallel Aligned TreebanksUser generated content out of automotive forumsIrish Dependency TreebankBaboukPledari GrondApertium Catalan Monolingual Dictionary from Spanish-Catalan language pairA Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British EnglishTartu Multimodal DataCARTOLACrisis Management CorpusElhuyar Basque-English dictionaryT2T3 TIMEX3 corporaInternational Workshop on Spoken Language TranslationBuckwalter Arabic Morphological AnalyzerFrown corpusPre-processed test corpus of RussianSpanish WikipediaCorpus Analysis and Validation for TimeML (CAVaT)Estonian Morphological DisambiguatorWebCAGePROIELIllinois Named Entity TaggerMyanmar-English-Myanmar DictionaryCorpus de Français Parlé ParisienDisjoint EventsModern Greek Spontaneous Essay TextAVATecH wrappers and utility recognizersMicrosoft Research Paraphrase Corpus (MSRP)Electronic Orange BookThe Wenzhou Spoken CorpusChild Language Data Exchange System (CHILDES)TAC 2008 Update SummarizationAligned Annotation ToolKazakh to Chinese Dictionary 
DatabaseAQUAINT-2TimeBank 1.2WordAlignerHumorWMBTILCI Annotation ToolEuropean Medicines Agency (EMEA) documentsInterrogatio AnselmiOpen American National Corpus (OANC)MorfeuszGerman (P)LTAG lexiconThe New York Times Annotated CorpusCzech WebcorpusEmospeech MPCorpusTAC 2010 KBP Training Entity Linking V2.0SentiSenseMorfologikAnnotated corpus of German political newsC-ORAL-JAPONSVMToolReview corpus annotated for speculation and negationTalbankenMicrosoft Video Description CorpusSpanish Learner Oral CorpusTxtCeramGold Standard for English non-deverbal eventive nounsConverter Freeling 2 DESROntoValence DictionaryPolish Parallel CorporaTimenEvalCWB-treebankIllinois Part Of Speech TaggerApertium Bilingual Dictionary from Spanish-Catalan language pairWikiWarsGermanSentimentDataMdbg Chinese-English dictionaryWapitiLEMLATIMAGACTJava Wikipedia Library (JWPL)Danish Dependency TreebankGramTransSentiment-annotated set of quotationsMyanmar Word SegmentationChoix de Textes de Français ParléParagraph alignment list (LINA-PAL-1.0)Paraphrase CorpusRobertino-gameAUDIMUSIntra Chunk Dependency ParserKyrgyz to Chinese Dictionary DatabaseFR-TimeBankCroatian CG Morphological TaggerText::NSPDutch WebcorpusLexicon-Grammar tablesTAC 2011 KBP English Evaluation Entity Linking Annotation v1.1Gold Standard for Spanish concrete nounsLEXUSTIPSemuima-connectorsIllinois ChunkerTime4SMSGermanSentiStrengthLexiconSCOLA broadcast newsBasque-Chinese comparable corporaMorphTaggerLexicon totius latinitatisSRI Language Modeling Toolkit (SRILM)Corpus de Référence de Français ParléParagraph Clustering List (LINA-PCL-1.0)ManhattanChinese to Uighur, Kazakh and Kyrgiz Parallel Dictionary DatabaseA2ST-COMPWeighted Lexicon of Event NounsUCS toolkitEnglish (P)LTAG treebankItalian WebcorpusTAC 2011 KBP Cross-lingual Training Entity Linking V1.1Bulgarian Morphological Dictionary XML editorGold Standard for Spanish human nounsWeb 1t 5-gram corpus (1.1)Illinois Named Entity RecognizerTime4SCIInstructional ManualsConnexor 
Machinese Syntax parserScorerRiTa.WordNet a WordNet library for Java/ProcessingLocalMaxsEnglish (P)LTAG lexiconPolish WebcorpusTAC 2011 KBP Cross-lingual Evaluation Entity Linking Annotation V1.1Gold Standard for Spanish semiotic nounsTimeMLIllinois Coreference ResolverNUS SMS CorpusLDC HUB4FEMA: Help After a DisasterAntconcPropBank (Proposition Bank)Spanish WebcorpusGold Standard for Spanish non-deverbal eventive nounsIllinois Semantic Role LabelerCrowd-Key-phrasesGovernment RecordsContextesNoClonePenn Treebank NP annotationIllinois Wikifierearly Government RecordsA free/open-source marker-driven example-based machine translation system (OpenMaTrEx)A Treebank for Finnish (FinnTreeBank)Automatic Syntactic Analysis for Polish Language (ASA-PL)GNU Linear Programming Kit (GLPK)Joint Research Centre JRC-AcquisPolish Word SketchesSoftware for Clustering High-Dimensional Datasets (CLUTO)Squoia SpellcheckSyntax in Elements of Text (SET)Topic Detection and Tracking (TDT Phase 3)Lin-EBMT^REC+The Survey Zone (SurZe)RACAI TTS Speech SegmentationmyresPLLMpre-test, training, post-test experimental designACE 2004 Multilingual Training CorpusPersian Syntactic Verb Valency LexiconFipsNormalizerRoGERGeneralLexicon_Fr-EnPerson Name Recognition for Alpine TextsIndian Language Part-of-Speech Tagset: HindiPrinciples of Part-of-Speech (POS) Tagging of Indian Language CorporaPhrase Structure GrammarEuroparl v6 French-EnglishKorpusik US II PWr (Small Corpus for WSD)ToTaLePOS tagged Data setOntoGenKashmiri Part of Speech TaggerBLEU/NISTPronunciation Errors from Learners of English Corpus and Annotation (PELECAN)A Chinese-English Code-Switching Speech Database (CECOS)Fisher English Training SpeechDysarthric Speech DatabaseBengali Speech CorpusMongolian speech corpus for speech synthesis (NUM, NITP, NECTEC)Multimodal corpus in multi-party conversationsMalay Emotional Speech Database (MESD)Buckwalter lexiconApertium Spanish monolingual dictionary from the Spanish-Catalan language pairIndiana 
Cooperative Remote Search Task (CReST) CorpusTask-10 test and training data Semeval 2010Bijankhan CorpusFeature Vector Set and Tree Forest (SVM-LIGHT-TK 1.2)PRObability-based PROlog-implemented Parser for RObust Grammatical Relation Extraction System (Pro3Gres)Lingua::TreeTaggerCorpora for Named Entity Recognition of Chemical Compounds (IUPAC Corpus)Domain Adaptive Relation Extraction (DARE)HCRC Map Task CorpusIcelandic Frequency DictionaryJoint Research Centre (JRC) NamesMulti-Perspective Question Answering (MPQA) Opinion CorpusOpen Mind Indoor Common Sense (OMICS)Rhetorical Structure Theory Tool (RSTTool)Suggested Upper Merged Ontology (SUMO)Treebank-2 TBa-D/Z Release 7Unified Medical Language System (UMLS)Unified Medical Language System (UMLS) MetathesaurusApertium linguistic data for the Spanish--English language pairApertium linguistic data for the Breton--French language pairApertiumWikipedia Category HierarchySemantic Parser (no specific name)opinion holder predicatesRST Spanish TreebankEmotiBlogControlled Language for Crisis Management (CLCM)KPG English Learner CorpusProject GutenbergEvent LexiconIT-TempeEval-2 Data SetWiki50QuestionBankNLP Resource Metadata Questions Treebank (NLP-QT)Image description CorpusPunjabi Resource GrammarAggregated texts and their semantic representationsAymara LFG GrammarTranslation memories of Ofis ar Brezhonegnewstest2008newstest2010Minho Quotation BankPublic Health Opinion Corpus (PHOC)BulTreeBankTBa-D/Z connective annotationMultilingual Named Entity Annotated CorporaNew York Times Annotated corpuswikAPIdiaPAN'10 plagiarism detection corpusi2b2 2010 corpusFrench-Romanian parallel corpusXML model of WikipediaBeygingarlýsing íslensks nútímamáls (BÍN)Mörkuð íslensk málheild (MÍM)Reusable Resources for the Romanian Language (RRRL)Health-related sentiments and opinionsOntology on Natural Sciences and TechnologyWSMT corpus annotated for sentimentMultilingual sentiment dictionariesWiki-Biographic-CorpusMultilingual summary evaluation 
dataMaupassant: segmented and tagged text into types by XML tagsNewstrain-08CoNLL2000 shared task datasetKDD-D / KDD-T DatasetsCoNLL 2008 Shared Task DataUnsupervised Dependency Parser for English v 1.0ptbconv-3.0SemEval 2010 Coreference Resolution Task CorpusDeSRSubjectivity LexiconTAC 2009 Document SetsArabic Treebank (ATB)MaltParserArabic Syntactic Dependency Parser ModelDocument Understanding Conference (DUC) CorpusNTCIR 2008 Training DataLexical Normalisation Annotations for Short Text Messages (LexNorm)Penn Discourse Treebank ParserLarge Movie Review DatasetMPQA Opinion CorpusEuroparlDiversity in Collective DiscourseQuery Dataset for Email SearchNIST 2009 MT Evaluation SetCrowdsourced translations, edits, and rankings for the 2009 NIST Urdu-to-English datasetCLC FCE DatasetCzech-English Parallel Corpus (CzEng)Stanford ParserUNT Computer Science Short Answer Dataset v 2.0LingPipeBritish National Corpus (BNC)English GigawordMessage Understanding Conference (MUC) 4 Terrorism CorpusKonan-JIEM Learner CorpusEuroparlLanguage Function Analysis Corpus (LFA-11)Baidu Zhidao CorpusBalanced Corpus of Contemporary Written Japanese (BCCWJ)Extended REX-J CorpusCLP2010 Testing Dataset of the Chinese Word Sense Induction TaskKnowledge Base Population Corpus (TAC KBP)Stockholm-Umeå Corpus (SUC) 2.0UIUC Question Classification Data (Training set 5)EuroparlEDR Electronic DictionaryTREC Entity 2010a User-Extensible Morphological Analyzer for Japanese (JUMAN)Darpa TIDES Surprise Language DatasetCoNLL 2006 Shared Task DataContact Center DataQuaero Broadcast News Named Entity CorpusPenn Chinese TreebankChinese Treebank 5.0Nihongo Goi TaikeiWordNet 3.0Automatic Statistical SEmantic Role Tagger (ASSERT)Kyoto Text Analysis Toolkit (KyTea)Document Understanding Conference (DUC) 2005 DatasetAdditional Review Datasets (9 products)English GigawordNamed Entity Recognition Corpus from the Fourth International SIGHAN Bakeoff Data SetsNTU Sentiment Dictionary (NTUSD)Kadokawa Ruigo Shin 
JitenGraph Based Word Sense Disambiguation and Similarity (UKB)NAIST Japanese DictionarySemisupervised Named Entity Recognizer (SemiNER)Collapsed Gibbs Sampling Methods for Topic Models (lda)SnowballMarkov thebeastKyoto University's Case Frame Data 1.0a Library for Support Vector Machines (LIBSVM)Multi-Domain Sentiment Dataset 2.0Subjectivity LexiconNihongo Goi TaikeiMainichi Newspaper DatabaseNew York Times CorpusJava Syntaxico-semantic French Analyser (J-Safran)A Praat script for extracting pitch targets from vocal signals (PENTAtrainer)Augmented Multi-party Interaction (AMI) meeting corpusAugmented Multi-party Interaction (AMI) meeting corpusAURORA Project Database 2.0 - Evaluation PackageAURORA Project Database 2.0 - Evaluation PackageAURORA Project Database 2.0 - Evaluation PackageBABEL Hungarian Speech DatabasesCentre for Spoken Language Understanding (CSLU) Names v1.3CMU Let's Go DataCMU_ARCTIC speech synthesis databasesCzech Speecon databaseExtensible Markup Language for Discourse Annotation (EXMARaLDA) Partitur-EditorHIWIRE (Human Input that Works In Real Environments) databaseInstitute for Signal Processing (ISIP) environmental noise signalsItalian SpeechDat(II) Modular Architecture for Research on speech sYnthesis Text-to-Speech System (MARY TTS)Quaero named entity corporaSpanish SpeechDat(II)Swiss-French SpeechDat(II)Swiss-German SpeechDat(II)The EMIME Mandarin/English Bilingual DatabaseTIMIT Acoustic-Phonetic Continuous Speech CorpusWSJCAM0 Cambridge Read Newsa Collection of Translation Error-Annotated Corpora (Terra)A Large-Scale Unified Lexical-Semantic Resource (UBY)A Large-Scale Unified Lexical-Semantic Resource (UBY)A Library for Large Linear Classification (LIBLINEAR 1.51)A Multi-layered Reference Corpus for German Sentiment Analysis (MLSA)A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language (MMPC)A new French Meta Grammar (frmg)An Open Toolkit for Automatic Machine Translation (Meta-)Evaluation (Asiya)an XML Based 
System For Corpora Development (CLaRK)an XML Based System For Corpora Development (CLaRK)Annotation Tool for Concepts and Relations (Recon) ???Arabic GigawordArabic Treebank (ATB)Arabic Treebank Part 3 v 3.2Arabic-English Parallel Aligned Treebanksasa-sentiment-analysisAtlante Sintattico d'Italia, Syntactic Atlas of Italy (ASIt)Austrian Academy Corpus (AAC)BAS Bavarian Archive for Speech Signals Pronunciation Lexicon PHONOLEXBasic dictionary of FinSL example text corpus (Suvi)BRIGHAM YOUNG UNIVERSITY British National Corpus (BYU-BNC)CALBC (Collaborative Annotation of a Large Biomedical Corpus) corporaComponent MetaData Infrastructure (CMDI) Information PageCoreference Resolution Engine (Reconcile)Corpora from the web (COW)Corpus DEFT (Défi Fouille de Textes)Corpus Internacional do Português (CINTIL) PropbankCroatian Inflectional Lexicon (MOLEX) Cross Language Entity Linking in 21 Languages (XLEL-21)Dependency Part of BulTreeBank (BulTreeBank-DP)Dependency Part of BulTreeBank (BulTreeBank-DP)Dictionnaire de valence des verbes français (Dicovalence 2)Dictionnaire fondamental de l'informatique et de l'Internet (DiCoInfo)Drugs@FDA Data FilesEmotional Speech Database for Basque (Ahoemo3)English Child Language Data Exchange System (CHILDES) Verb Construction DatabaseEnglish GigawordFlexible Error Annotation Tool (feat)Format for Linguistic Annotation (FoLia)Free Text File Merging Tool (TXTcollector)genchal-repositoryGeneral Ontology for Linguistic Description (GOLD)German Web Corpus (DeWaC)Glossary of International Relations (GLOSSIR)Helsinki Finite-State Technology (HFST) toolsInternational Workshop on Spoken Language Translation (IWSLT) 2011 parallel TED CorpusJava library for detecting Multi-Word Expressions (jMWE)Joint Research Centre (JRC) Eurovoc Indexer JEXKTH eXtract Corpus (kthxc)Lancaster-Oslo/Bergen (LOB) Corpuslanguage generation evaluation toolkit (lg-eval)Large Bilingual Speech Database for Synthesis (Ahosyn)Learner Corpus of HungarianLexicon Enhancement via 
the GOLD Ontology (LEGO)MAZEA-WebMITRE Dialogue Kit (MIDIKI)Modeling linguistic corpora in OWL/DL (POWLA)Modern Arabic Representative Corpus 2000 (MARC-2000)MorphoAdornername recognizerMultilingual Turin University Treebank (ParTUT)Multilingual UN Parallel Text 2000-2009 (MultiUN)Multi-perspective question answering (MPQA) sentiment lexiconNational Corpus of Polish Near-Identity Relations for Coreference (NIdent) CA Official Europarl test set from WMT 2008pantera-taggerParallel Corpora Collector (PaCo2)Parameterized & Annotated CMU Let's Go Database (LEGO)Persian Treebank (PerTreeBank)Perspicuous and Adjustable Links Annotator (PALinkA)PORT-MEDIA DomainRecognising Textual Entailment (RTE) 2 Test SetReference Corpus of Contemporary Portuguese (CRPC) Modality SampleSerbian morphological electronic dictionary (SrpRec)Serbian morphological electronic dictionary (SrpRec) Serbian Wordnet (SrpWN)SignWriting improved fast transcriber (SWift)Similarity Metric Library (SimMetrics)SMI Remote Eye Tracking DeviceSpeech Database in Basque for Synthesis and Voice Conversion (AhoSpeakers)SPeech Phonetization Alignment and Syllabification (SPPAS)STEVIN Nederlandstalig Referentiecorpus (SoNaR)Swedish Kelly listSyntactic lexicon for French (Lefff)The AQUAINT Corpus of English News Textthe BiLingual Annotator/Annotation/Analysis Support Tool (Blast)The Concisus Corpus of Event SummariesThe DGT Multilingual Translation Memory of the Acquis Communautaire (DGT-TM)The Freiburg - LOB Corpus of British English (FLOB)the open parallel corpus (OPUS) TreeTaggerUK PubMed Central (UKPMC)Verb Pattern Sample, 30 English verbs (VPS-30-En)W2C - Web To CorpusWeb-Harvested Corpus Annotated with GermaNet Senses (WebCAGe)Word clAss taGGER (WAGGER)WordNet Libre du Français (WOLF)A free/open-source marker-driven example-based machine translation system (OpenMaTrEx)A Treebank for Finnish (FinnTreeBank)Automatic Syntactic Analysis for Polish Language (ASA-PL)EuroparlGNU Linear Programming Kit (GLPK)Joint 
Research Centre JRC-AcquisSoftware for Clustering High-Dimensional Datasets (CLUTO)Syntax in Elements of Text (SET)Topic Detection and Tracking (TDT Phase 3)Pronunciation Errors from Learners of English Corpus and Annotation (PELECAN)A Chinese-English Code-Switching Speech Database (CECOS)Indiana Cooperative Remote Search Task (CReST) CorpusTask-10 test and training data Semeval 2010Bijankhan CorpusFeature Vector Set and Tree Forest (SVM-LIGHT-TK 1.2)PRObability-based PROlog-implemented Parser for RObust Grammatical Relation Extraction System (Pro3Gres)Corpora for Named Entity Recognition of Chemical Compounds (IUPAC Corpus)Domain Adaptive Relation Extraction (DARE)EuroparlHCRC Map Task CorpusIcelandic Frequency DictionaryJoint Research Centre (JRC) NamesMulti-Perspective Question Answering (MPQA) Opinion CorpusMulti-Perspective Question Answering (MPQA) Opinion CorpusOpen Mind Indoor Common Sense (OMICS)Rhetorical Structure Theory Tool (RSTTool)SRI Language Modeling Toolkit (SRILM)Stanford Log-linear Part-Of-Speech TaggerSuggested Upper Merged Ontology (SUMO)Treebank-2 TBa-D/Z Release 7Unified Medical Language System (UMLS)Unified Medical Language System (UMLS) MetathesaurusConVoteEnglish/Hindi and English Arabic Gold Standard for TransliterationKDD-D / KDD-T DatasetsCoNLL 2008 Shared Task DataBrown Coherence ToolkitUnsupervised Dependency Parser for English v 1.0ptbconv-3.0Penn TreebankNEDNAIST Text CorpusKyoto Text CorpusSemEval 2010 Coreference Resolution Task CorpusTextProChaSenDeSRCaboChaMovie Review DataMulti-Domain Sentiment DatasetSubjectivity LexiconFrameNet 1.5SEMAFOR 2.0WordNetDIRTSherlockTPEAraucariaTAC 2009 Document SetsArabic Treebank (ATB)Corpus of Arabic Functional MorphologyMADAMaltParserArabic Syntactic Dependency Parser ModelColumbia Arabic Treebank ConverterDocument Understanding Conference (DUC) CorpusWMT 2010 Translation Task DataNTCIR 2008 Training DataLexical Normalisation Annotations for Short Text Messages (LexNorm)Penn Discourse 
TreebankPenn Discourse Treebank ParserAutomatic Text Labelling of TopicsLarge Movie Review DatasetTarget-dependent Twitter Sentiment Classification AnnotationMPQA Opinion CorpusNTCIR Opinion CorpusTAC KBP 2009 DataTDT5 Brown Word ClustersEuroparlJoshuaCharniak parserACE 2005CCGBankSAMT extensionEvaluation Data for Hyponymy Relation MiningGoogle N-gram corpus Web 1T (2006)Diversity in Collective DiscourseQuery Dataset for Email SearchNIST 2009 MT Evaluation SetCrowdsourced translations, edits, and rankings for the 2009 NIST Urdu-to-English datasetSwitchboard CorpusCLC FCE DatasetWikipediaNELL Ontology and Knowledge BaseMicrosoft Research Video Description CorpusMADA (Morphological Analysis and Disambiguation for Arabic) tool kit.LinGO Grammar MatrixTAC KBP Annotation and Assessment GuidelinesNUs Corpus of Learner English (NUCLE)MosesSemeval 2010 word sense induction and disambiguation datasetHindi Projected Treebank from English-Hindi Tides Parallel CorpusCzech-English Parallel Corpus (CzEng)MapTaskEnron EmailsEvaluation Annotated/Unannotated data and evaluation code for NP coordination disambiguationStanford ParserUNT Computer Science Short Answer Dataset v 2.0LingPipeEnglish Incremental Right-Corner Grammar for HHMMupparseDeceptive Opinion Spam Corpus v1CELEX2British National Corpus (BNC)Classification of News Articles on Contentious IssuesPenn Chinese TreebankHebrew and Arabic Morphologically SegmentedEnglish GigawordMessage Understanding Conference (MUC) 4 Terrorism CorpusKonan-JIEM Learner CorpusCoNLL-X and CoNLL 2007 datasets20-NewsGroupsWebKBXinhua ChineseStanford Log-linear Part-Of-Speech Taggerir4qa_evalService Quality Evaluation Data SetMinna no Hon'yaku (MNH, Translation for ALL)ANERcorp + Our Own Corpus20newsgroupReuters-21578SpamassassinScale datasetSimple Rule Language Global Health RulebookSimple Rule Language EditorBioCaster OntologyRen-CECps 1.0NTCIR's Japanese patent document corpusNEU-Restaurant-ReviewRestaurant-Review-Snyder and Barzilay 
(2007)ROUGE-1.5.5breakSent-multi-lf.plSemCorChinese CCGbankChinese Penn Treebank 6.0Web 1T 5-gram corpus30 noun pairs from Rubenstein and Goodenough, and by replacing them with their definitions from the Collins Cobuild dictAmazonCNProductReviewsFreeLangMeSH (Medical Subject Heading)Multilingual glossary of technical and popular medical termsFIRE 2010 dataenglish to hindi dictionary shabdakoShaUnsupervised incremental parserNEGRAChinese PennTreebankWSJ Penn TreebankChinese Proposition BankPDTBIRNA newspaper text corpusAryanpour Persian to English dictionaryUSENET corpusMATEUkwabelana corpusDocument Understanding ConferenceROUGEBNCThe Switchboard-1 Telephone Speech CorpusTREC-8 collectionLucenewekaAn unsupervised incremental parser (CCL)Chinese product reviewsNTU sentiment dictionaryAppraisal lexiconMPQA subjectivity lexiconSentiWordNetMovie review data setStanford POS taggerReutersISOLETEuroparl v3NICT JEL corpusTo be announcedSzeged LVC CorpusmorphStanford Lexicalized ParserMorfessorCELEXLingua::JA::Summarize::ExtractMeCabText summarization corpus for the credibility of information on the WebTSUBAKIProp BankPennTree BankFBIS Corpustest set containing non-compositional and compositional phrasesEnglish_VPNtest set for QA candidate rankingmstparserHowNet Knowledge DatabaseMaltConverterChinese Treebank 5.0movie review datasetbooks, DVDs, electronics, and kitchen appliancesPT-EN /EN-PT translation lexiconYahoo! 
Answers QA Pairs under Healthcare DomainUGC tokenizerPortuguese Twitter corpusCLEF 2009 test collectionsNLTKDoshisha eye-gaze dialogue dataBerkeley ParserukWaC corpusCQPTDT4Evaluation Benchmark for Bilingual Lexicon ExtractionFreelingChinese Temporal Annotation Data SetChinese TreebankBrandeis Annotation ToolILSP/ELEFTHEROTYPIA MODERN GREEK CORPUSNTCIR patent corpusICTCLASarXMLivFresaBrill's TaggerEnglish Penn Treebank, Chinese Penn TreebankSentence Re-ranker based on Information Extractionanswer selection datasetRTE data setACE 2004 training dataJenaStanford NERAQUIANTTREC QuestionsChinese emotion lexicons(five emotions)Chinese Language Technology PlatformHowNetCOAE2008-task3Cross-lingual event predicate clustersevent annotated ontonotesEvent Annotated Carbon Sequestration data?????(Internet lexicon SogouW)Web 1T 5-gram Version 1natural language toolkitWordNet 3.0Yahoo! web searcherFlorianpolisWordNetBR or TEPBTECad hoc tasks of TRECOpenNLPOpenCalaisKEABrown corpusTycho Brahe parsed corpusTSUBAKI document collectionMulti-media Multi-lingual concept, relation and event annotated corpusConcept mapping table between video and textSemeval 2007 English Lexical Sample corpusChasen Japanese Language parserNTCIR-8 Mainichi Shinbun 2005Semeval 2010 Japanese Lexical Sample corpusTREC AP corpusPorter stemmerDeliciousVerbOceanNLTK for PythonFrameNet and WordNetBioInferMoguraIEPAAkaneREHPRD50AIMedLLLCKIPStanford Named Entity RecognizerNTCIR-8 Patent Translation dataJUMANMainichi NewspaperDUC 2007 Summaries and Pyramid annotationsBilingual corpus on patent domainComplexChineseQAtestdataDegExtNTCIR CLQA Chinese QuestionsThe Penn Chinese Treebank 6.0DUC-2006, DUC-2007 dataPenn Chinese Treebank 6.0Dan Bikel’s randomized parsing evaluation comparatorBayonKNPNIST MT 03-06 training and test corporaXinhua of GigawordBi-sentences,lexicon LDC2005T34,Name Entity LDC2005T34NIST MT 03-05 training and test corporaText Analysis ConferenceGIZA++SogouTManipuri POS taggerNamed Entity Recogniser 
for Manipuri using SVMPOS.LMManipuri StemmerSRILMManipuri-English Parallel CorpusYAMCHAStanford Dependency ParserTinySVMNISTmorphaJournalisticNL11CTB6Lefff 3.0Word Relatedness DatasetsANNODIS corpusGoogle Web1T corpus (LDC2006T13)IMDB actorsUMLSEuroparl and News-Commentary corporaReview referencesPeoples Daily from 1993-1997BLOGS06TREC corpus Disk4&5WT10gTreeTaggerGerman LFG grammarTiGerCoNLL 2000 shallow parsing data setEnglish Chinese Translation Treebank 1.0Chinese Treebank 6.0500M Japanese Sentences on the WebJeuxDeMots lexical networkMultilingual Statistical Parsing EngineEvent ExtractorEnjuCharniak-Johnson reranking parserC&C ToolsThe LTH Constituent-to-Dependency Conversion Tool for Penn-style TreebanksGDepGENIA treebankBioNLP'09 shared task data setPWKPEnglish WikipediaTDT corpusAlpinoTwente Newspaper CorpusThe Prague Dependency Treebank 2.0English-Korean Parallel CorpusTIDES Extraction (ACE) 2003 Multilingual Training DataKorean RDC corpusPeople's Daily CorpusUyghur to Chinese MT corpus (UCC)Bengali NEWS Editorial Opinion CorpusBengali Blog Opinion CorpusSentiWordNet (Bengali)MPQAMUC6Matlab SOM-ToolboxWordNet 2.0ir packageJWNLCorpus of Interactional Data (CID)A Large English-Chinese Parallel CorpusLinguistic Data ConsortiumBengali NEWS Editorial OpinionUyghur Encyclopedia (UE)target datasetLTPFNE datasetFZ NER ToolShared Swedish/English Regulus grammarEnglish Gigaword Fourth EditionFBIS and MTC data setsBLEUSynmttkGerman dependency treebank with new automatic featuresA Uyghur Tokenizer and part-of-speech taggerChinese Penn TreebankWikipedia (English)Wikipedia (German)dict.cc lexiconUyghur parsing corpusJuliusSimultaneous Interpretation DatabaseClause Boundary Annotation ProgramDUCEncycloMedical Subject HeadingsDutch Sentiment Lexicon (adjectives)DECA Species CorpusSimulated Contact Center DialoguesSecond International Chinese Word Segmentation Bakeoff DataFive PPI CorporaChinese Learner English CorpusPropBankChinse Verb Error Evaluation CorpusEnglish 
Gigaword Second EditionEncarta treasuresChinese Proposition Bank 1.0Bikel parserMSNBC News test setYahoo! News Resolution SetListening-oriented DialoguesIban-English LexiconIban corpusQTagIban-Malay LexiconWorNet 3.0GATEEmotion holder AnnotatorEmotion Blog CorpusWordNet AffectVerbNetCRF ChunkerEmotion Topic annotated blogAffect databaseSentiFulSentences from Experience ProjectConnexor Machinese SyntaxWMT 2010 system combination task corpusBerkeleyParserAAC-Austrian Academy CorpusWikipedia dumpTDT3, TDT4, TDT5Simple English WikipediaEnglish WiktionaryGCIDEMicrosoft Research Question Answering CorpusSimple English WiktionaryOmegaWikiSUCREFeature-GroupingEnglish TreebankVarro ToolkitCIPS-eval dataFreebaseEnglish - Bengali Parallel CorpusMalt ParserFrench Tree BankGH-MAPCoNLL Shared Task 2009 CorpusPenn Chinese Treebank 5.1Tsinghua Chinese TreebankMPQA Opinion Corpus version 1.2 with additional judgmentsSpanish-English EuroparlVerb Noun ListVerb OceanSogou Query LogAOL 2006 Query LogCLEF corpusWeb ServiceEnglish Web as Corpus (ukWaC)Penn Chinese Treebank 5MPC -- multii-party chat corpusmovie review data subjectivity datasetsGALE Y1 Q2 Release - LDC/FBIS/NVTC Parallel Text V2.0Revenue CorpusSRI language model toolkitCRF++Stanford Chinese Segmentermteval scoring scriptJoshua - open source hierarchical phrase based systemEvent and non-event nouns test-setStanford NLP POS taggerEMMA - Evaluation metric for morphological analysisEnglish to UNL Corpusidiomatic sentences test datasetWan's keyword extraction datasetKyoto University Text CorpusA simple C++ library for maximum entropy classificationpeccoHyponymy extraction toolChinese-English Translation Lexicon Version 3.0Eijiro, Third EditionEDR Electronic DictionaryWanfang Data Chinese-English Science and Technology Bilin-gual DictionaryEDICTMainichi Shimbun CorpusmmaEnglish Gigaword Corpus Fourth EditionNews Tweets For SRLTimeBankCCG SRL toolRussian corpusDMoZ corpusACE 2007BioCreative 2 Gene Mention Recognition CorpusCoNLL 
2003 NER shared task corpusSCD-based Dictionary Entry ParserEuroparl with multilingual synsetsCooperative Remote Search Task (CReST) corpusMCPGGyaanNidhiGyaan NidhiLexiconEILMT parallel corpusACESRL annotated data for UrduSALTOHebrew annotated documentsmulti-score summarizerTAHAnon-native speaking data dataMicrosoft Web N-gram ServicesWiktionaryBritish National CorpusCDS datasetGold Standard for Sentence ClusteringEmotiBlog corpusEmotiBlog annotation modelTIGER CorpusEmotion Annotated data for UrduDUC 2002 Summarization CorpusWordNet Sense RelateHPSG-WSJMarkus DickinsonHindi Dependency TreebankSoftwareConsumerMeterA Corpus of Plagiarised Short AnswersPAN-PC-09Multi-Domain Sentiment Dataset (version 2.0)Rapidminer 4.6JWI (the MIT Java Wordnet Interface)Penn Discourse Treebank 2.0RST Discourse TreebankUKB: Graph Based Word Sense Disambiguation and Similarityydta-yanswers-manner-questions-v1.0MG4JLibsvm20 Newsgroups Document Categorization datasetR52UIUC Question classification DatasetThe Multimillion Q&A Pair CollectionMicroblog/Twitter Summarization Data SetJavabased MaxEnt packageIJCNLP-08 NERSSEAL Shared Task DataNamed Entity Annotated DataWeb-based Bengali news corpusRTGenGenISpatial patterns datasetMRP Readability CorpusReuters Corpus annotated with NP coreferenceSentiProductLexiconTERGenSemUrdu Resource GrammarInspec Database Keyword Extraction Data SetDocument Understanding Conference Past Data for Text SummarizationASKNetConceptNetPicture Books OntologyCoreference resolution data in opinion mining domainBAF corpusEuroparl (Manually annotated)Unsupervised, Language Independent Sentence AlignerNE listsBKB (BKB-nytfootball-v0.7.5)BioNLP'09 Shared Task on Event ExtractionCMC ICD coding corpusObesity datasetMorphadorner2007 Computational Medicine Center Challenge CorpusTextual Entailment Specialized DatasetSelf-Annotation ToolOpenNLP ToolkitFSParGerman Wikipedia articlesWordNet 2.1SenseLearner 2.0Stanford POS Tagger 1.6Stanford Names Entity Recogniser 
1.1SIGHANTreebankAdaptiveCorefNgram Search EngineHansard Corpus, Public Release of Haitian Creole Language Data by Carnegie Mellon, FBISCelebrityOpen Directory Project Full CorpusWePS benchmark dataEvaluation of BioExcom on the BioScope corpusEmotiNetFormality Word ListsICWSM 2009 Spinn3r Blog DatasetmwetoolkitGeniaMWE 2008 data setsTiger treebankReuter-2157820-NewsgroupPenn TreeBank 3, Switchboard corpus partIndic Language Transliteration DataEnglish Gigaword CorpusBeijing language acquisition corpusJ KalitaItalian CCG Treebank (CCG-TUT)ERGenju HPSG parserMEDLINE databasesentence taggeranonymized-for-blind-reviewlongest_nereeMedT_NERLexeedRomanized Text Language IdentifierWMT 2009 datasetSinica Corpus of Modern ChineseGeneral EnquirerCnet product reviewsArabic Penn TreebankACE 2005 ArabicACE 2005 Handwritten ArabicLDC2008T19DUC2002Generative Semantic Parsing Model using Hybrid Tree FrameworkWASP and WASP^{-1}Robocup sportscasting corpusWeb Dataset for Text-based Image Annotation DevelopmentPTB NP Bracketing Data 1.0Google V2 and n-gram toolsComputerWorldEnglish Entity Detection and Tracking corpus for 2004 ACE projectMultilingual MPQAPerseus Latin Dependency TreebankRWSCorPMC Open Access SubsetC&C toolkitKorean emotional speech corpusKorean TV drama scriptsOntology created from Wikipedia Animal articlesChinese Emotion Corpusoracle database qa forumThe 4 Universities' datasetDish Names in Chinese Language Blog ReviewsWikipedia Vandalism Corpus WEBIS-VC07-11Question RankingChinese-English sentence level aligned bilingual corpusMICA20 newsgroupsReuters RCV1ChineseBookDescriptionWithTagsChineseBlogsWithTagsDORISpointingApril10TSUBAKI CorpusKoeling et al. 
(2005) corpusKanji TesterKanji Tester response logsJMdictChinese law articlesICTCLAS(Institute of Computing Technology, Chinese Lexical Analysis System)Chinese acedemic papersBioScopeUW parallel meeting corpusdict.ccopen thesaurusdingde-newsJWPL/JWKTLTranslation AnnotatorJRC AcquisKulkarni Name CorpusTWA sense tagged dataAcl Anthology Network (AAN)GermaNetPenn Tree BankStockholm Ume CorpusCHILDESJapanese WikipediaMMSEGNLTK packageText_Classification_Reuters_CorpusA thesaurus of argument structure for Japanese verbsMEDLINE/PUBMEDDBLPWorld Atlas of Language Structures (WALS)FrameNetSemEval2010-task 10Tagged Medical Forum DataChinese Collocation Dictionary of Content WordsTaKIPIBTaggerIPI PAN Corpus of Polish (manually disambiguated part)Yahoo!AnswersTest dataAmazon Reviewsi2b2 2009 shared task on medication extractionIREX data set for NE recognitionWikipedia Infobox ExtractsMUC Coreference Data SetACE-2 data setCoNLL 2005 DatasetFATE corpusFN transformerShalmaneserDUC 2002 DatasetKyoto Text Corpus version 4.0mogura HPSG parserBLIPP 1987 & 1988 corpusPenn Treebank 3.0Chinese Treebank 5.1NTUSDTYPO (misspelling) CORPUSMedisysAnnotated corpusOpinosis Summarization Demo SoftwareTopic Related Review SentencesMEAD Summarization ToolStanford's NLP ParserAmazon Mechanical TurkHercules DalianisRelevant Term extractorWikiXMLDumpTextExtractorSighan 2005 bakeoff dataSogouT CorpusAutomatic Content ExtractionLangid.pyMPQA DatasetBiomedical Gene Mention Linking CorpusGraded Compositionality Scores for Compound Nouns"Yahoo! 
Chiebukuro" dataUofT Blog CorpusJapanese National Pension Law CorpusProbabilistic Word Classes with LDALanguage Function Analysis Corpus (LFA-11)LGLexBaidu Zhidao CorpusAmazon.com Review Rating Prediction DatasetBalanced Corpus of Contemporary Written Japanese (BCCWJ)NTCIR-3 WEB (Web Retrieval Test Collection)Extended REX-J CorpusSemEval 2007 Lexical Substitution Task DatasetBS Computer Science CorpusMicrosoft Research IME CorpusCLP2010 Testing Dataset of the Chinese Word Sense Induction TaskKnowledge Base Population Corpus (TAC KBP)Customer Review DatasetJapanese Extended Named Entity CorpusStockholm-Ume Corpus (SUC) 2.0UIUC Question Classification Data (Training set 5)TREC Entity 2010Digital Review Data SetChinese Web 5-gramBasic Travel Expression Corpus (BTEC)North American News Text CorpusMMAXa User-Extensible Morphological Analyzer for Japanese (JUMAN)Darpa TIDES Surprise Language DatasetFTA Labeled ACL Anthology AbstractsSandhi Parallel CorpusGALE LDC Parallel DataGeppettoNatural Language Programming CorpusHyderabad Dependency TreebankCoNLL 2006 Shared Task DataWeb 2.0 TreebankCross-Language Entity Linking Test CollectionContact Center DataCantonese-Mandarin Parallel CorpusOpenMWEErgbioQuaero Broadcast News Named Entity CorpusYUWEI CorpusNihongo Goi TaikeiHindi TreebankWord Clipping Test SetMarkov thebeastMorfetteSupervised Latent Dirichlet Allocation for ClassificationAutomatic Statistical SEmantic Role Tagger (ASSERT)Kyoto Text Analysis Toolkit (KyTea)Document Understanding Conference (DUC) 2005 DatasetAdditional Review Datasets (9 products)LINA-PAL 1.0Sandhi RulesNamed Entity Recognition Corpus from the Fourth International SIGHAN Bakeoff Data SetsNTU Sentiment Dictionary (NTUSD)Sogou User Input RecordKadokawa Ruigo Shin JitenGraph Based Word Sense Disambiguation and Similarity (UKB)NAIST Japanese DictionarySemisupervised Named Entity Recognizer (SemiNER)Collapsed Gibbs Sampling Methods for Topic Models (lda)SnowballKyoto University's Case Frame Data 
1.0Transcoding Sanskrit Formatsa Library for Support Vector Machines (LIBSVM)General InquirerMulti-Domain Sentiment Dataset 2.0Steady Selling Product Review DatasetLIBLINEARMainichi Newspaper DatabaseNew York Times CorpusJava Syntaxico-semantic French Analyser (J-Safran)WSJ0-WSJ1A Praat script for extacting pitch targets from vocal signals (PENTAtrainer)Augmented Multi-party Interaction (AMI) meeting corpusAURORA Project Database 2.0 - Evaluation PackageBABEL Hungarian Speech DatabasesCentre for Spoken Language Understanding (CSLU) Names v1.3CMU Let's Go DataCMU_ARCTIC speech synthesis databasesCzech Speecon databaseExtensible Markup Language for Discourse Annotation (EXMARaLDA) Partitur-EditorHIWIRE (Human Input that Works In Real Environments) databaseInstitute for Signal Processing (ISIP) environmental noise signalsItalian SpeechDat(II) Modular Architecture for Research on speech sYnthesis Text-to-Speech System (MARY TTS)Quaero named entity corporaSpanish SpeechDat(II)Swiss-French SpeechDat(II)Swiss-German SpeechDat(II)The Accents of the British Isles (ABI-1) Speech CorpusThe EMIME Mandarin/English Bilingual DatabaseTIMIT Acoustic-Phonetic Continuous Speech CorpusWSJCAM0 Cambridge Read NewsTransducersaurusKLAIR ToolkitObject-Based Seech RecognizerPhonetisaurusSyllitestRussian Emotional Corpus (REC)Boston University Radio News CorpusQuaero Extended Named Entities annotation guideVery Large Pronunciation Vocabulary for RussianAurora Project Database - Revised Aurora Noisy TI digits database - (Version 2.0)AhoVoicedDBAMI corpusUtsunomiya University Spoken Dialogue Database for Paralinguistic Information StudiesBoston Radio News CorpusKALAKA-2SpeechDat(II) ENWinPitchJapanese phonetically-balanced word speech databaseLIPS 2008 AV CorpusNIST LRE 2007EMU Speech Database SystemWashU-UCLA Corpus of Subglottal AcousticsAKUEMEnglish Read by Japanese (ERJ) databaseSpeech Feature Tool available at Centre for Speech Technology University of EdinburghTest Audio Data for 
Repeatition DetectionTranscriberWitchcraft WorkbenchDomainEditorHOESIElectropalatographic corpus for Standard ChineseRomanian Speech Synthesis corpusKALAKA 2The Edwardians: family life and work before 1918Quaero Named Entities evaluation toolAnnotation guidelines for Dutch-English word alignmentGold Standard corpus for Dutch-English word alignmentHandAlignxpressive Speech Labeling Tool Incorporating the Temporal Characteristics of EmotionNIMITEK CorpusMulti-Lingual Image captionsCatchWord Speech SynthesiserLexique et grammaire de dérivationA fragment of Northern Sotho grammar: The verb of Northern SothoPeykare Or Textual Corpus of Persian LanguageStuttgart Finite State Transducer ToolsWikipedia (Turkish section)TRmorphZemberek spell checker word listMETU Turkish CorpusText+Berg CorpusAttribution Corpus of ItalianISSTMMAX2AVALONBase de datos sintácticosCorpus of temporal-causal structureThe English-Swedish-Turkish Parallel TreebankLink GrammarProject documents ontologyCLANNWordsmith toolsUnitexCalendar Expression Semantic TaggerNaviTexteAlborada-I3A corpus of disordered speechCharlatan Synthetic Dialog CorpusEEP search interface evaluationReal-word error corpusA list of confusion setsWitchcraftNo nameCorpus Chaines de Reference (CoChainRef)TTLCoreference Named Entity (CoRefEN)Coreference Chain Genre dependent identification module (CoRefGen)DPC: Dutch Parallel CorpusA Parallel Corpus of Monologues and Expository DialoguesOntoLing's ontologiesLAF/GrAFMETHONTOLOGYWebODEOntoTag's ontologiesOntoLing annotation modelA taxonomy of discourse (coherence) relationsPolish websites corpusWikipedia MinerOpenThesaurus2009's Text summarization corpus for the credibility of information from the WEBLX-ParserLX-Parser WebserviceLX-ServiceAdd MS KitStandards for Controlled LanguagesCourse material, writing manual and evaluation techniquesCourses on the writing of safe and safely translatable alert messages and protocolsMULTEXT-East Version 4JOSSloWNetFidaPlusSlovene Term 
Extractordifferential semantics synset annotationMultimodal Russian Corpus (MURCO)Tree-to-tree alignment tool Lingua-Alignbilingual corpusKalashnikov 2K dependency bankFISCALDBMARITERMItalWordNetCMT Corpus of Maritime terminologyCFT Corpus of Fiscal TerminologyEuroWordNetWordNet 1.5 and 3.0SINDACDBCST Corpus of Synsicate-labour terminologyVOLEMDictionary of Affect in LanguageSentiWordNet 1.0.1XuxenBfomaPELCRA Search Engine for the National Corpus of PolishAnotatorniaText Encoding Iinitiative (TEI)National Corpus of PolishPoliqarpFAUSSCINTIL Logical Form BankCINTIL DeepGrambankCINTIL TreebankCINTIL Dependency BankCINTIL CorpusCINTIL PropbankmorphistoGERTWOL/GERGENStripey ZebramOLIFdeDiabasegrande grammaire du franaisTagged and Cleaned WikipediaProprietory Interactive Voice Response CorpusNear-Identity Relations for Coreference (NIDENT)Annotated Corpora (AnCora)OntoNotesUtility Evaluation for Information ExtractionSonarText Encoding Initiative (TEI)Gold Standard for Dutch sentiment bearing adjectivesMulticultural Romanized Name MatchesCorpus of Ambiguous Abbreviations and Gene Names in the Biomedical DomainTRIPS OntologyTRIOS-TimeBank corpusHebrew-English transliteration dictionaryService-Finder Automatic Semantic AnnotatorJapanese Lexicon AcquirerJapanese Web CorpusHebrew CHILDES corpusSpeechDat, Callfriends, Broadcast newsGTAAdbpediaEASYLEXIndex Thomisticus TreebankCommentExtract 1.0STExAssociative Concept Dictionary (ACD)Associative Concept Dictionary for VerbsTest Bench for transliteration of Indian language to EnglishNomage lexiconFrench TreebankVerbactionNomage CorpusLUNA corpus of conversational speech in ItalianThe NumGen (Generating Numerical Expressions) CorpusQuæro QA corpusMorphOzWEB-QA and TREC-QAGRANSKA taggerSUCCogFLUXEXMARaLDAFOLKERdrhumanEster 2 Named Entity CorpusQuæro named entity corporaMainichi News PaperAozora BunkoLinES guidelinesI*LinkMachinese SyntaxLinköping English-Swedish Parallel Treebank (LinES)Alpaco alignment editorUrdu Transliteration 
ToolsAMI meeting corpusISST-TANL Dependency Annotated CorpusTurin University Treebank (TUT)AppraiseSpeech Recordings for Unit Selection CorpusEvaluation Tool for Subjective Loudness PerceptionSIGHAN Bakeoff 2006 Chinese Word Segmentation DataVALLEX 2.5TrEdPDT-VALLEXPML-TQPrague Dependency Treebank 2.0DanNetNeue Zürcher ZeitungHaGenLexRegression-Forest TaggerPUNKTPreposition Noun CombinationsPiNERDbt & FaccetteLymba's abbreviation dictionaryKYOTO-architecture / Knowledge Yielding Ontologies for Transition-based Organization’ - architectureSrebrenica corpusBitParPAC (Predicate Argument Clustering)RCV1Semantic spacesAppraisal Lexicon (lexique de l'évaluation)ApopsisDeCoSoNaR Named Entities AnnotatierichtlijnenSpeeralESTER 2005 database development setGigawordIMAIL-SSIA-2009ZAPIwebGSEDC (Gold Standard for Event Detection in Croatian)Genia corpus annotation for BioNLP/NLPBA 2004Czech Morphological Analysis in PDT 2.0TectoMTCross Language Evaluation ForumIR Multilingual Resources at UniNEtrec_evalNon-projective dependency parsing using spanning tree algorithmsISST-SSTBasque WordNetEDBL : Lexical database for BasqueEULIA: Tool for Morphological AnnotationEustagger: lemmatizer/tagger for BasqueEPEC (Reference Corpus for the Processing of Basque).AbarHitzEUSEMCORlibiXMLBasque Dependency Tree Bank (BDT)EiheraSyllabeur-v2.1.jarWord Sketch Grammar for RussianSketch EngineIRASubcatWikicorpusUKBSenSemPIITHIE corpusJRC Acquis Latvian-English parallel corpusMARIEEnglish-Spanish Large Statistical Dictionary of Inflectional FormsHIFI-AVGUM-3-SpaceGUM-Space: Evaluation Data and Annotation InstructionsTools for querying an N-gram databaseTools for web-scale N-gramsMPI/DOBES Language Resource ArchiveSlovenian Lombard Speech DatabaseMuLeXFoRPukWaCWaCkypedia_enDEXONLINENEOROMRoMorphoDictLuconMapudungun-Spanish MT test suiteMapudungun-Spanish AVENUE Machine Translation Grammarlist-question-answering pargraphsPANChinese Opinion TreebankOpinion Annotation Tool (OAT)MNH SDF corpusStockholm EPR 
Corpus / Speculative clinical textOfficial Documents of the Congress of Deputies in XML formatAfrican WordNetCASIA-CASSILLarge Vocabulary Thai Continuous Speech Broadcast News corpus (LOTUS-BN)TLexs: Thai lexeme analyserGranular Time Ontology for Temporal UnderspecificationThe D-TUNA CorpusAnCora-Nom-EsAnCora-Verb-EsAnCora-EsADN-ClassifierBase de Français MédiévalNouveau Corpus d'AmsterdamNotaBen RDF Annotation ToolGiellatekno parserNameDatNordisk Språkteknologi (NST) corpusOnomastica interlanguage pronunciation lexiconText handlerChamber debatesCIAIR Back-Channel Utterance CorpusCIAIR in-car speech corpusGerman Voice Services Agender DBDutchParlGernEdiTLEX corpusLEX monolingual corpusEUR-LEX translation memoryCollection of newspaper articlesThe Prague Dependency Treebank 2.0 (PDT 2.0)Message Understanding Conference (MUC) 6Mor?eSemantic Annotation Tool (SAT)D-Coi corpusCornettoCGNNijmegen Corpus of Casual Spanish (NCCSp)GIRASIGAVirtual Language WorldCORPRESFAU IISAHThe Quranic Arabic CorpusBAStatSyntactic Annotation Guidelines for the Quranic Arabic Dependency TreebankVariKN Language Modeling toolkitUniversal Declaration of Human RightsGoogle AJAX Language APIAyDASpanish2MSLDepartment of Education Text BooksAUTONOMATA Spoken Name Corpus (ASNC)AUTONOMATA-g2p-toolkitFine-Grain Morphological Analyzer and Part-of-Speech Tagger for Arabic TextSPECIALIST dTaggerMXPOSTHealth Information Readability CorpusCurran and Clark POS TaggerFunGramKB OnomasticonMicroKnowing: Microconceptual-Knowledge SpreadingFunGramKB LexiconFunGramKB GrammaticonCOREL: Conceptual Representation LanguageFunGramKB SuiteFunGramKB OntologyFunGramKB MorphiconWTIMIT 1.0Eindhoven CorpusThe CELEX Lexical DatabaseFrequentielijst 27 Miljoen Woorden Krantencorpus 1995broad-coverage lexical resource of ArabicPubMed CentralAVLaughterCycle databaseSmart Sensor IntegrationI-EN-SAMPLEKI-04SANTINISHGCMGCKRYS-ICorpus del españolCorpus de Referencia del Español Actual (CREA)LPCC - a large parallel corpus of 
cleftsRetokenized EuroparlJava WordNet::Similarity (beta)JWPL-Java Wikipedia LibraryJava Statistical ClassesAnymalignUniversity of Maryland Parallel Corpus Project: The BibleThe Berkeley Word AlignerCC-CEDICTMGIZA++Bible Bilingual LexiconsMPROAUTOTERMAutonomata TOO native infrequent and multilingual speech corpusMainichi Newspaper ArticleBusiness News Story CorpusEvent and Sentiment Segmentation Gold StandardAustrian Phonetic Database (ADABA)ELAN annotation toolOntology for Equipping Upper and Domain Ontologies With TimeTREATTree taggerRitel-ncaXeros Incremental Parser XIPMulti-Annotated Corpus of Answers to Questions, MACAQveneto-english parallel corpusHMM-based dialogue annotationDihanaSwitchBoardUnicode CBETA ArchivesMeta-Knowledge Annotation Scheme for Bio-Events (MeKASBE)U-CompareGlossaNordic Syntactic Judgments DatabaseNordic Dialect CorpusA Python Toolkit for Universal TransliterationETS Textual Entailment Test Suite for the Evaluation of Automatic Content Scoring TechnologiesChinese Character Data (Hanzi Data)Etymological WordnetSUMOUnicode Character DatabaseGoogle TranslateISO 639-3UWNLa RepubblicaItalian FrameNetItalian Valence LexiconMultiWordNetNorKompLeksFonema TTS front endSpontal-NBulgarian National CorpusMAATSRFTaggerMB-TaggerCText Tagset for AfrikaansAfrikaans Word ListsAfrikaans Beeld CorpusTnT TaggerCallSurf ManTransTopics-140Test suite for biomedical ontology concept recognition systemsSwedish fuzzy wordnetELANAcademia Sinica Balanced Corpus of Modern Chinese (Sinica Corpus)Emotion Cause Event CorpusGF Resource Grammar LibraryDADLIPS: Lexical Isolation Point SoftwareCambridge Cookie-theft CorpusVeteran TapesAnnotated Corpus of Difficult-Antecedent Referring Expressions (DAREs)Reference Engine Development and Evaluation EnvironmentEnglish-Galician Europarl parallel corpusBoB (Bozen-Bolzano Library Bot) user dialoguesRomanian Russian WordNet-Affect RoRuWNAPOS-Tagged New Testament in Wolof (Matthew gospel)ProtgEcoLexicon 
CorpusWordsmithToolsSMORELexicon of negation cuesThe PDTB XML converterLUNA.PLPDT 2.0 annotation toolsSupeSense TaggerSemeval-3 Task 6MyTerMSTermFactoryADL NLP Analytic ToolWest African Language Archive (WALA)pn-filterTsumugi-1.0.1A ECA-MSA LexiconSpoken Dialogue OntologyRWTH-BOSTON-50RWTH-BOSTON-104RWTH-FingerspellingRWTH-PHOENIXWomens Studies EncyclopediaAbstracts of the 39 JournalsWomen's Studies International ForumAustrian Academy CorpuscorpusEditorAAC corpusBrowserJWPLWikipedia (the Spanish version)CoCoAuthorship Paraphrase CorpusCross-lingual WSD Benchmark Data SetFipsRomanianNTCIR-1 Test CollectionSpontalNBA video pages collectionGATE Access and Interpretation of SentiWordNetDifficult Speech Corpus (DiSCo)CREAGESTAccentuatorAccentological corpus of RussianLanguage GridPAN Plagiarism Corpus PAN-PC-09Plagiarism Detection EvaluationCost-Conscious Annotation Supervised by Humans (CCASH)Speech database for unit selection synthesis of Viennese varietiesClean English ACE 2005 Event Trigger CorpusLingNetREMBRANDTAriadneItalian Legal FrameNetQASTLE (Question-Answering Systems TooL for Evaluation)CLEF - QAST 2007-2009 Evaluation PackageThe Database of Catalan AdjectivesTagalog WiktionaryICL-SearcherPeople's DailyChinese Semantic DictionaryAEGIR lexiconSPECIALIST lexiconNomBankAfazio TestSuiteNomLexXTagComLexGLISSANDOTreebank-3Penn Arabic Treebank 2TIGER Corpus 1.0The ICSI Meeting CorpusPenn Arabic TreebankTwitter corpusESTERLefffPrague Czech-English Dependency TreebankAMICA Medical Dialogue CorpusTwo-level utterance-unit annotation schemeMST parser (maximum spanning tree parser)French treebank converted into dependenciesMElt (Maximum Entropy Lexicon-enriched Tagger)Annnotation OntologyOntology-based Semantic Annotation X CorpusKALAKACollective Action Framing CorpusRainbowThe Revised Chinese DictionaryChinese Bi-Character Words' Morphological Types CorpusBlack Bean Chinese Word Segmentation SystemNTCIR CIRB040NIST Open Machine Translation (OpenMT) EvaluationThe 
Stanford ParserSwitchboard corpus annotated with dialogue actsANNIEsMailsMail Speech Act Mining RulesSummTermCESTA Evaluation PackageCINEMOJEMOArbilTMEKO, Tutoring Methodology for the Enrichment of the Kyoto OntologyKyoto OntologyKyoto TermdatabaseGeneRegKyTea - the Kyoto Text Analysis ToolkitGoldstandard of German morphological analysisStuttgart MORPhology (SMOR)Hebrew-English parallel corpusLAMPADATerm-minatorGerman Reference Corpus DeReKoPiTaggerJapanese WordNetEDRlexiconAffectiveTask SemEval2007AffectiveTask SemEval2007 Subsets with Figurative Language AnnotationsPolArtTerm UnionUMLFSignCom Projectdoxa-jv-corpusEurogene systemEurogene Multilingual Genetic OntologyDekang Lin's Similarity ThesaurusesEdit Distance Textual Entailment Suite (EDITS)Grammar error ratioGranskaIMS-corpus collectionSELFEHpassage cpcv3LiveMemories Corpus for ItalianWikiWoodsXLE English ParGram grammar + AKRKurzprotokolle des deutschen BundeskabinettsCompanion WoZ corpusMultimodal Task-Based CommunicationControlled Language in Crisis Management (CLCM)Guidelines for Evaluators for MT output errorsspoken language samples for the South African official languagesBAWERitual Descriptions CorpusWSJCAM0 British English speech databaseIraqi Arabic Did You Mean...?Information Science Institute Elicited Imitation CorpusArabic Nonnative Speaker Pronunciation Error ModelDictionary of Iraqi Arabic (Arabic-English)MAGEAD: A Morphological Analyzer and Generator for Arabic and its DialectsJAPIO patent abstractsEDR bilingual dictionaryJST scientific paper abstractsFredPowerset Gold-Parsed Wikipedia CorpusModality LexiconConceptMapperTMCEuropean name modelsPersian WordNetLwazi TTS corpora (v Sep 2009)Lwazi ASR corpora (vSep2009Lwazi primary pronunciation dictionaries v1.0BabyExpUPC_ESMAAC/DCcorte-e-costuraRomanian and English corporaRomanian and English word form lexiconsASIt - (Atlante Sintattico d'Italia, Syntactic Atlas of Italy)Gold Standard for Spanish Mass NounsVerb ThesaurusCSJ SDR Test 
collectionCSJSenso Comunetest collectionsSzeged Dependency TreebankDiccionario Clave de Uso Del Español ActualDicionario del Real Academia EspañolaDicionario General de la Llengua EspañolaDictionaries of Swedish names and common wordsStockholm EPR PHI CorpusAhoTransfDriFTechnical-Lay paraphrase lexiconDiabetes and cancer monolingual comparable corporaMalagaPROMETHEUS DATABASEMinefieldRODRIGOGIDOC prototypeSpecialized Datasets for Textual EntailmentThe Leeds Arabic Discourse Treebank (LADTB)Arabic Discourse Annotation Tool (ADA tool)the Penn Arabic Treebank (Part 1 v. 2.0)TTSlabCS1RITELCatalan Version of a Subset of the EuroVocSpLaSHFrench FrameNet built with Bilingual Dictionaries : FR.FrameNet.BiDicSCI-FRAN-EURADIC Dictionnaire bilingue français-anglaisSemantic Map @ CEA LISTFR.FrameNetB3DBAnnotated RTE-5 Search Data SetAugmented RTE-5 Search Data SetSanchaySentiWSInternalShort Movie Reviews for Opinion Mining and Authorship Attribution ExperimentsAnnoSemUtoolextension of Pang and Lee: polarity 2.0poldata_RatingExtractorOntoSearcherLocalNERReference Corpus of Contemporary Portuguese (CRPC)International Corpus of Portuguese (CINTIL)POLYMOTSA Dynamic HPSG Treebank of the Wall Street Journal sections of the Penn TreebankCross-lingual relatedness thesaurusMaltEvalHyderabad Dependency Treebank (HyDT)BART: Baltimore Anaphora Resolution ToolkitAutoTutorTCFHuman Annotation for Machine TranslationDK-CLARIN LSP corpusSystem for integration Corpus Management, Processing and AnalysisEuroparl v.2Connexor machines syntaxEnglish-Swedish word alignment gold standardMaximum Likelihood Linear Regression (MLLR)Improved Automatically Trainable Recognizer of Speech (iATROS)Wall Street Journal (WSJ)Kemo/DemoNon-canonical constructions in oral discourse: a crosslinguistic perspective (NOCANDO)ModeLex CorpusCREAM TBXENOVAJeuxDeMots French lexical networkMASKKOTCzech Multi-channel Speech Database of DSP LecturesWikiNER (semantically annotated corpus for Catalan)QAST 2007-2009 evaluation 
packageGeoNamesClearTKLanguage TagMooney dataset: geoquery dataMAtrixware REsearch Collection (MAREC)French WikipediaQA corpus for answer justificationDysLsCorefProItalian WikipediaEntity Mention ClassifierSWiiTHunglish1984-EH-NPGIVE-2 Corpus Collection SoftwareGIVE-2 CorpusWAPUSK20GRIDIACDiachronic Corpus of SpanishUrdu Verbs Lexicocn BuilderLIMAECI Multilingual Text CorpusCoNLL 2003 Shared Task Named Entity dataSemiNERA common benchmark for text wikification toolsThe Wiki MachineDepPatternInfomatKTHNC - KTH News CorpusSaturnalia corpusRecAlignWordsPubMedBioGRIDDjangology: A Light-weight Web-based Tool for Distributed Collaborative Text AnnotationCAVaT - Corpus Analysis and Validation for TimeMLWordNet SimilarityGAWEXUMLS Specialist LexiconLink ParserSoNar IPR Acquisition ManualACE data 2007FrameNet transformerLegal CorpusTerm ExtractorArt History CorpusILC - NLP Statistical ToolsEnvironmental -Legal CorpusPAROLEPOMDELAFCorpus VALIBELLS-COLINWebSourdDicoLSFVenProPetit Larousse Illustr 1905SelectPOSArabic Treebank Part3 - Version 3.1Corpus SearchArabic Treebank Part 2 v 3.0TreeEditorArabic Treebank Part 5 V1.0XTRANSThe Bikel Statistical Parsing EngineSAMAArabic Treebank Part 1 v 4.0HunalignMorphological tagger of the Corpus of the Contemporary Lithuanian LanguageHungarian-Lithuanian parallel corpusJOS ToTaLe text analyserMorphological tagger of the Hungarian National CorpusHungarian-Slovenian parallel corpusJames PustejovskyData categories for communicative functionsHindi/Urdu TreebankSpanish FreeLing Dependency Grammar (EsTxala)Simplified Corpus Semantically Annotated with Wh-Question LabelsACE (Automatic Content Extraction) 2005 CorpusLDC2009E73 Standard Arabic Morphological Analyzer (SAMA) Version 3.1Arabic Treebank part 3 version 3.2 LDC Catalog Number: LDC2010T08De Mauro Paravia Dictionary of the Italian LanguageItWacItalian MWE databaseEngvallexPerDiPa CollectionLoonyBinIdentity Matching Adjudication Collector+: IMAC+TIGR Annotation Guidelines for Entity 
Extraction and Information Retrieval Ground Truth CreationCarafeTIGR Evaluation Methodolgy / GuideMALLETApproaches to Automatic Quality Estimation of Manual Translations in Crowdsourcing Parallel Corpora DevelopmentUniversal Word - Hindi DictionaryHindi WordNetPrinceton English WordNetGiza++ ToolFFV Spectrum Computation in Snack Sound ToolkitThe GENETAG corpusThe AIMed corpusThe GENIA corpusThe BANNER NER systemiMAP referring expression corpusMusicNavi2 databaseWS4LRGeolISSTermSrpRecSrpWNBehavior Language Corpus DatabaseSimultaneous Interpretation Database (SIDB)Domain product featuresSmartSUMORoget's ThesaurusWSD-IXAZT Corpusa - Zientzia eta Teknologia CorpusaMorfeusIhardetsiLDC wordlistan online Chinese-English dictionaryLemurTREC Genomics data from 2006 and 2007an online Chinese-English biomedical dictionaryan online Chinese MeSHDatabase of Narrative SchemasElixirFMNational corpus of spoken SlovenianFrench and English Contexonym DatabasesFrench and English Synonym DatabasesFrench and English Translation DatabasesAOL query logIDIAP corpus of political debatesCAREGIVEREnglish-Latvian comparable corpus extracted from WikipediaCorAlStockholm-Ume corpusRuneberg projectEuropean Parliament Proceedings Parallel Corpus 1996-2006Unified Eventity Representation (UER)SemInVeSt (Semantically Interpreted Verb-centred Structures)Lexical Markup Framework (LMF)Spanish Resource GrammarPassageCORPS - a CORpus of tagged Political Speechesreference corpusLT4eL English Learning ObjectsSwedish Scientific Medical CorpusCALL-SLTRegulusRascalli Gossip DBYAGOMaster Metaphor ListMetaphor CorpusPenn WSJ Treebank v.3FragmentSeekerVergina speech databaseWikiNetEnglish-Gujarati DictionaryEnabling Minority Language EngineeringMorphological Analyser for South Asian LAnguagesGATE Morphological Analyser (part of the GATE system)Japanese Particle CorpusIDIXRASPPlayMancer DatabaseAuthor Gender Analysis of TextWorld Wide English corpusGann - Graphical Annotation ToolC-3 (Coherence and Coreference 
Corpus)Pilot Arabic CCGbankSERASympalog SymRecErlangen Corpus of Speech Recognition TranscriptsFrench dysarthric corpusOntology Library for Intelligence DomainOntology Library for Financial DomainYou tubeErlangen Valency Pattern BankDARESProUTPragmatic Resources of Old Indo-European LanguagesDeWaCDISCOTAC 2009 Knowledge Base Population Track: corpora, knowledge base, guidelines, queries, and assessmentsTechnical domain lexiconAlignment of FrameNet and WordNetANNEXLAT BridgeThe Lemur ToolkitTRECEVALEuskaltermMorris hiztegiaCLEF Data Collections: LA Times 94, Glasgow Herald 95, topics and human relevance judgementsCzEngAZ-II corpusSAPIENT:Semantic Annotation of Papers Interface and Enrichment ToolCoreSC/ART corpusAZ-II annotation guidelinesCoreSC Annotation GuidelinesPrague Czech English Dependency TreebankYahoo! Local categoriesYahoo!'s local listings in ChicagoPAROLE LexiconC-ORAL-ROM - Integrated Reference Corpora for Spoken Romance LanguagesCorpus LE-PAROLECorpus CINTIL-PREPLEXOSCorpus CINTIL - Corpus Internacional do PortugusEPAC corpusOnline Transcription Tool (OTTO)MUC7TNumber Sense Disambiguation annotations of the Enron CorpusEnron CorpusProppOntostand-off annotation proposed by ISO committee on Language Resources ManagementWebLichtProppian fairy tale Markup Language (PftML)The EDR Electronic DictionaryYahoo!-TREC Question CorpusA system for automatically identifying changes in the semantic orientation of wordsDOOM, Romanian Lexical Data Bases: Inflected and Syllabic Forms DictionariesTREC evalMeSHNLGbAseGoogle Search APIYahoo Search APIFastKwicNPCEditorRen-CECpsRen_CECps 1.0Corpus for Verbal Intelligence EstimationRomanian Generative LexiconPsyCoL Maltese Lexical Corpus (PMLC)Broadcast audioSample of ANC Annotated for IdiomsEMM NewsExplorerEMM NewsBriefSrpFSDGALE Phase 5 Chinese Parallel Word Alignment and Tagging Part 1GALE_Chinese_WA_tagging_guidelinesGALE Phase 4 DevTest Chinese Word Alignment and TaggingGALE_Chinese_alignment_guidelinesGALE Phase 4 
Chinese Parallel Word Alignment and Tagging Part 1The Indiana Cooperative Remote Search Task (CReST) CorpusElicited Imitation Test Item Development ToolMulti Layered Hindi Dependency TreebankAnnotation Manual for Evaluation of Agent DialogueThe CMU pronouncing dictionaryICT corpus for speech recognition evaluationWSJ acoustic and language modelsCambridge HTK, HDecodeCMU SLM toolkitSRI Language Modeling ToolkitCMU SphinxNICT Kyoto tour guide dialogue corpusCorpus of Editorials from Newspapers published in Nepal and worldwide and annotation of arguments and opinionsAnnotation scheme or semantic tagset,Enqute Socio-Linguistique Orlans (ESLO)IITKGP Text Emotion CorpusPerugia Corpus (PEC)Dictionary of Italian CollocationsAn annotation scheme of modality in a broad senseA Japanese corpus annotated with labels in a scheme of extended modalityEnglish - Oromo Parallel CorpusCycloneUKWACRelProp Adjective CorpusMultiUNXipFrAG (French Annotation Grammar)Multilingual corpus for Opinion MiningQuaero QA corpusXerox Incremental Parser (XIP)Polish wordnet, plWordNet (S?owosie?)SuperMatrixClarin Web Services at Wroclaw University of TechnologyMulti-APIpepr Framework - Process Engine for Pattern RecognitionEnabling Minority Language Engineering (EMILLE)Q-WordNetJapanese to English MT rules for Multiword Functional ExpressionsJapanese hierarchical Ontology for Multiword Functional Expressions TsutsujiMeaning-Text Theory (MTT)Corpus of Meaning-Text Structures (CoMTeS)context sensitive variant dictionarySextantPerLexSpecies2000Evaluation des Systmes de Transcription enrichie d'missions Radiophoniques - ESTERBREFThe Nijmegen Corpus of Casual French - NCCFrBulgarian FrameNetCMU Pronunciation DictionarySUPPLEMuNPExMiniParPAXEnglish-Latvian localization TMEMEA - European Medicines Agency documentsLatvian news corpusLeipzig Corpora CollectionThe DGT Multilingual Translation Memory of the Acquis Communautaire: DGT-TMConnexor parserSORA corpusPAROLE sottinsiemeASC-ITPAROLE-SIMPLE-CLIPSCPA-It 
Italian Pattern DictionaryLT World OntologyHeart of GoldCorpus of Arabic SpeechAsian WordNetMultilingual voice creation toolkitLT4eL terminological lexicons in IT domainLT4eL ontology in IT domainLT4eL multilingual corpora in IT domainTurin University TreebankEuroparl Parallel CorpusAlpino TreebankAriadne Corpus Management SystemMaNaLaGeoQueries250RestQueries250extSVM-lighT-TK 1.2RestQueries250GeoQueries250extGerman Idiomatic PNV-Triples in Context (GIPIC)Duden 11 / Redewendungen: Wrterbuch der deutschen IdiomatikFrankfurter Allgemeine Zeitung, FAZArabic Tree BankRelExStanford English grammatical relation extraction utilityStanford part-of-speech taggerCharniak and Charniak Johnson ParserTripAdvisor Data SetWordNet-AffectWordNet DomainsSlashdot Comments CorpusHamshahriBijankhanYamCha: Yet Another Multipurpose CHunk AnnotatorSemantic Case Frames of EnglishA part-of-speech tagger for EnglishHunPosTokenizerTwinityGossip ontology and celebrity DBfurniture ontologySPAS (Structure and Point Annotation of Stories)Dutch corpus for abbreviation detection and resolutionCorpusSearch2Syntax-oriented corpus of Portuguese Dialects - CORDIAL-SINSimple Parser for HindiA Multi-layered/Multi-representational Treebank for HindiCornerstonePropbank frameset filesBrandeis Annotation Tool (BAT)Spanish SpeeCon databaseOpenLogosReEscreveReWriterPort4NooJEng4NooJNooJ20 Newsgroups Data SetANC Manually Annotated Sub-corpus (MASC)A text corpus annotated with usage informationLearning Based JavaProPOSEC: a Prosody and POS annotated Spoken English CorpusDependency based Transfer rulesThe Ellogon language engineering platformAnnotation tool for bilingual aligned corporaeg-GRIDS+Annotation Schema for Collocation Errors in Learner CorporaGALE 5W Distillation EvaluationCROVALLEXItalian BARTShabdanjaliDaniel PipesIIIT-TidesSRILM - The SRI Language Modeling ToolkitAddicterAgriculture domain parallel corpusEmilleCICC Indonesian Basic DictionaryGMA (Geometric Mapping and Alignment)Belgisch Staatsblad 
corpusVerb Lexicon for Second Language LearnersC-Comparator v0.23MIRE based shallow parserMorphosyntactically annotated Greek corpusGreek Dependency TreebankILSP Dependency ParserILSP LemmatiserILSP Text Simplification toolILSP Term ExtractorILSP ChunkerTimeEL corpusILSP FBT TaggerTimeELperson name ontologyNew Stuttgart Radio News CorpusTREC09_ChatArabic VerbnetMinimaPraat version 5.1TIMITHidden Markov Model ToolkitPrague LabellerCelex EPW (English pronunciations)Kachna L1/L2 Picture Replication CorpusSentiWordNet 3.0KORAIS speech databaseGerManCParallel text-image french news corpusFrameNet to WordNet mappingAdaptation of the David Chiang's (2000) STIG parser (eg. Hybrid)POETICON CORPUSWorld Wide Arabic corpusBank of Russian Constructions and ValenciesItalian TimeBankPAROLE-SIMPLE-CLIPS PISAAnCoraARTiFactOZayaPersian Linguistic Database (PLDB)FarsNetSTeP1PeykarehTranslated Wikipedia InfoboxesFairy tale corpusThe ABLE biodiversity corpusThe German-Russian Parallel Corpus of Sigmund Freud’s „The Interpretation of Dreams“MindNetGalician speech corpusGalician lexiconTAUS Data Association TM PoolLinguisticaC99Usability Guidelines for Annotation ToolsQuranyDiscourse Graph BankArabic WikipediaMINELexArabic WordNetLDC2007T23 GALE Phase 1 Chinese Broadcast News Parallel TextLDC transaltion guidelineseXtended WordFrameNetCorpora of Corpus FactoryVERTLASSYGENIA CorpusGENIA OntologyALeSKo: Annotiertes LernerSprachenKorpus (annotated learner language corpus)SALSA CorpusMaJoRussian Positional TagsetBlogBusterEllogonSTeP-1-TokenizerSTeP-1- morpho analyszerSTeP-1-POS TaggerCzech Web CorpusCzech National CorpusError-Annotated German Learner Corpus (EAGLE)Lingenio Corpus ToolA2STSense Folder CorpusChinese menu corpusJAPE EditorArabic TreebankQuaero NE patent corpusTSR CorpusChungdahm English Learner CorpusMIT FlightBrowser CorpusMIT Address CorpusWami ToolkitConstrative Lexical Evaluation of Machine TranslationN-codePOS TaggerFinite State TokenizerGold Standard POS Tagged 
CorpusDIINAR.1Delicious datasetLT4el ontology on computingCorpus of naturally-occurring corrections and paraphrases from Wikipedia's revision historyGRISP (General Research Insight in Scientific and technical Publications)Bilingual dictionaryCLIC Corpus della Lingua Italiana ContemporaneaRACAI web serviceCTLJ-ServerEDRCroatia Weekly 100 kw Corpus (CW100)CroTag Morphosyntactic TaggerScheherazadeLatent Semantic Analysis WebsiteMetricsMATR08 development dataJava WordNet::SimilarityREBECAWoordenboek van de Drentse dialectenPKUtreebankNICT_JC_SPNIKKEI_BPJurisdicCompanyMCommunicatorSpatial Annotation SchemeBase ConceptsOntoWordNetCore WordNetDOLCE Foundational ontologyDOLCEIntelliTextRoZPPr.A.Ti.D.EuroTermBankLemmaldIceTaggerIceParserLDC Data Exploration ToolkitLDC MADCAT Management Web AppLDC Word Alignment ToolThe Web Col FrameworkLDC Machine Reading Annotation ToolBase de datos de verbos, alternancias de diátesis y esquemas sintácticos del español (ADESSE)MPC: A Multi-Party Chat Corpus for Modeling Social Phenomena in DiscourseMachine Reading P1 NFL Scoring Training Data (LDC2009E112)Machine Reading P1 IC Training Data (LDC2010E07)TAC 2009 KBP Evaluation Entity Linking ListTAC 2009 KBP Evaluation Source DataEster2FrameNet-Wordnet DetourVerbaLexCG treebankAutoTagTCGCorpus of Czech sentences with manually annotated clause boundariesThe Quran and Tafsir CorpusbibleSyntactic lexicon of Arabic verbsAmarakoshaCollection of Croatian Financial TextsHansard, HLT evaluation dataSystranGTMCross-Corpus ModelOALNOUN COMPOUNDS IN CZECH, ENGLISH AND ZULUkddo1kdd09cma1The PIT Corpus of German Multi-Party DialoguesFipsCoTo be definedTermExtractorWebcorpEcoLexiconOrthographic Agreement's Knowledge BaseHuman Language Technology Virtual OrganizationAVE evaluation collectionsSpecies 2000wordnets in various languagesJubileeACE 2003MUC6 Annotationspair-hmm-translitGMTK-DBN-transliteration-model-scriptsJRC Quotes Collection for Sentiment AnalysisDicionário AbertoNuance ASR/NLU GrammarDiscourse 
ontologyGuidelines for Caption Annotation: Notes for annotators on how to identify and ground toponym expressions in captions.KnowtatorTRIPOD Corpus of Annotated Image CaptionsANNALIST: ANNotation ALIgnment and Scoring ToolToponym Ontology Geocoding Service (TOGS)Stuttgart-Tbingen Tagset of GermanTbingen Treebank of Written German (TBa-D/Z)RPM2 Summarization and Sentence Compression CorporaGermanPolarityCluesOnline Database of Interlinear text (ODIN)MASCWorldmapperEnt2WikiL3MorphoOlympus NLU Evaluation FrameworkOntoTag's abstract architectureConnexor's FDG ParserLACELL's POS taggerBitext's DataLexicaEAGLES Recommendations for the morphosyntactic and the syntactic annotation of corporaMultilingual Question-Answer Pair CorpusMetaphorDecoOwlExporterMutation Impact TaggerMutation Miner OntologyGEMSDIPSMarathi WordnetPrinceton WordnetJWPL TimeMachine6-monthly snapshots of WikipediaUTD-MotionEventLanguage Technology Resource CenterPattern Dictionary of English Verbs (PDEV)PUMAWQueryPolNet-Polish WordnetCorpus brut amazigheCorTradA Gesture Analysis and Modeling Tool for ANVIL (GAnTool)ACLP American English Messaging Lexiconthe FreeTalk Conversation CorpusThe ESP_C CorpusTest set: TREC 10 questionsAnnotation guidelines for CoNLLTraining set 5(5500 labeled questions)NOMCO multimodal Nordic corpusCorpus of Spontaneous JapaneseBalanced Corpus of Contemporary Written JapaneseArabOrth lexiconCatib version of the Penn Arabic Treebank part 3 v3.1HornMorphoLDC Arabic-English parallel corpusAnCoraPipeArabic-English Parallel Word Aligned Treebank CorpusDan Bikel Multilingual Statistical Parsing EngineLDC2005T20Acquis , UNs , Meedan , LDC2004T17ALGASDList of questions and their answersArabic-English parallel corporaMoses Phrase-based Statistical Machine Translation systemArabic PropbankDialectal ArabicMILA Hebrew CorpusDialectal Arabic ResourcesKawakibCWBMila morphologial analyzerResource grammar in GFARALEXThe Essex Arabic Summaries Corpus (EASC)DIINARMaltParsrNext Generation 
Localisation Process MapCorpus IDVIULA Text HandlerBrills TaggerFpgrowth Algorithm ImplementationDANTEDanNet , Arab WordNetVocon3200 BasqueZT CorpusaAnHitzDlgMatxinElezkariAhoTTSTANL (Text Analytics for Natural Language)UIMAWordFreakOpenNLP ToolsAWAdbGlozzannie-rdfannie-alphachunking-synaf-enmaf-enTEI (Text Encoding Initiative)IFrameILCI corporaOpen-Content Text CorpusFreeDictOntologies of Linguistic Annotation (OLiA ontologies)RevLemISOcat Data Category Registry (DCR)LIR/Lexical Information RepositoryMLIF/MultiLingual Information FrameworkISOCatSALTGrAFFSR - Feature Structure Representation (ISO 24610-1)ISOcat Data Category RegistryLEXUS, ToolboxXLIFFISO DIS 24612 Language resource management - Linguistic annotation framework (LAF) and other ISO standardsLwazi TTS corpusLwazi ASR corpusIgbo CorpusLuo Part-of-Speech TaggerParallel CorpusRules for annotating VPs of Northern SothoLanguage Identification for eleven South African LanguagesSwahili-English parallel corpusCTexT Alignment InterfaceParallel Text Corpora for 3 SA language pairsANERcorpSYNERGYHelsinki Corpus of SwahiliKamusi ProjectUIUC Learning Based Java (LBJ) Named Entity TaggerDagbani CorpusG?k?y? 
CorpusLwazi ASR CorporaLDOS-PerAff-1Corpus of negociation between users and virtual charactersOPENCV Processing and Java LibraryAnvilOpenCVDYNEMO: A corpus of dynamic and spontaneous emotional facial expressionsD64 Multimodal Conversational CorpusNordic multimodal Corpus (NOMCO)The USC CreativeIT DatabaseSpeech & Prosody SegmentationAudio-Visual Corpus (3D Faces and Speech)3D Face TrackerDigital Replay System (DRS)TimelineCIDCALLAS gesture expressivity corpusInSight InteractionUTEP-ICT Cross-Cultural Multiparty Multimodal Dialog CorpusDOMESemaine DatabaseSAL DatabaseEmomultidisciplinary medical meetingsFilMED - Filipino Multimodal Emotion DatabaseHuComTech Multimodal (Audio-visual) DatabaseSaGATKKVICLODanPASS dialoguesHead Pose and Eye Gaze dataset HPEGMozilla Semantic DesktopArguMeetDUC 2007 DatasetMobySNOMED CTConcord-EDOthersRPAH-ICUDisease and Adverse Effect CorpusPubMed Stopword ListChinese Medical Subject HeadingsLemur ToolkitTREC Genomics 2006 and 2007谷歌金山词霸2.0 (Google and Kingsoft Dictionary 2.0)GENIA EventAnonimised patient recordsBody part ontologyBulgarian DictionaryColorado Richly Annotated Full TextEstonian WordNetBalkaNetSPSSSALDOThe Specialist LexiconelexikoGensimCzech Digital Mathematics Library, DML-CZMARF and its ApplicationswoofDKProRussian WordNet, Russian WordNet GridJavadoc doclet corpus generatio toolBehemothcTAKESMachine Learning for Language Toolkit (MALLET)UIMASTJULIE Labs UIMA Collection Reader for WIKIPEDIAAutomatic Annotator softwareService-Finder datauimaSolrCASCorpusPediaCRPCForeign Language Examination Corpus of the University of Warsaw (FLEC UW)Research Artcle CorpusukWaC + itWaCCross Language Translation and RetrievalChinese and English Parallel Corpus Extracted from Comparable PatentsWortschatz Universitt Leipzig - Corpora and Language StatisticsComparable Corpora for EU LanguagesACCURAT Initial Comparable CorporaIrish Sign LanguageARPToolkitSpanish-LSE corpusSign Language Pose Estimation 3D Parse TreesRWTH Phoenix Weather 
ForecastAVATecH databaseAVATecH ApplicationSigns of Ireland CorpusItalian Sign Language CorpusSignCom corpusOnno CrasbornAmerican Sign Language Lexicon Video DatasetDGS-CorpusCopyCat CorpusData Collection PlatformSign verification systemLabel ToolNational Center for Sign Language and Gesture Resources (NCSLGR) corpus, Boston UniversityAuslan CorpusCUNY ASL Motion-Capture CorpusRussian Sign Language Explanatory DictionaryThe basic dictionary of FinSL example text corpus (Suvi)MoodleILexJASigningSILADON_CZSigning footage recorded from TV with simultaneously broadcasted subtitlesOCELLESRWTH-BOSTON-400CatCGCatalan Sign Language Corpus on the weather report domainNIST score (script: mteval-v13a.pl)am_toolsCorpus NGTJulie Hochgesangdasher for sign writingDicta-Sign : API for pluginsGSL Classifier CorpusSIGNUM DatabaseAmerican Sign Language SynthesizerAugmented Reality Authoring Tool with Online Information Integration PlatformQuestion Analyzer for RomanianThe Eurogene corpusThe Eurogene ontology of human geneticsMyanmar Word TokenizerMyanmar Name Entity RecogniserSense Tagged CorpusEnglish-Myanmar Parallel Text alignerCollective Named EntitiesName Matching Evaluation FrameworkSxPipe/NP2evalSxPipe/NPENCOREBART Anaphora Resolution ToolkitACE-2 Version 1.0LBJ Coreference PackageMIX1Asymmetric Threat Response and Analysis Program (ATRAP)BoB dialogue logsYahoo! 
Answers Comprehensive Questions and Answers corpus [version 1.0]Question Answer Sentence PairExcite query logWikipedia article namesAnnotated Webclopedia question collectionTREC QA questionsmulti-XEN1MSzeged CorpusHungarian Webcorpusmorphdb.hunamed entity taggersRipperFodina Patent TextsChoose the Right Word Formality PairsHunmorphFormalism for Morphological and Phonological GrammarMPI Language Resource ArchiveStephen Jay Gould ''Leonardo's Mountain of Clams and the diet of Worms.'BAMDESSpoken Turkish CorpusLeffeeuromobil 2LexkitGeorgian Ontological Semantics LexiconNetgraphIndex Thomisticus Valency LexiconGuidelines for the Syntactic Annotation of Latin TreebanksAnnotations at analytical level: Instructions for annotatorsHFST finite state speller evaluationWikipedia dump and scriptsOpen source morphology for Finnish (omorfi)Nganasan LexiconTypeCraftSoraLexHas not been named yetTnTCombiTaggerfnTBLBidirectional taggerCorpusTaggerThe Icelandic Frequency DictionaryIceNLPUrdu data annotated for EmotionsmwttoolkitGreek MWE dictionaryA Georgian-Russian-English-German Valency Lexicon for Natural Language ProcessingJRC-AcquisSouth-East European TimesRomnaian syllables data baseLexParJRC-ACQFrRoFRROAUFDGT Translation MemoryHungarian WordNetMagyar rtelemez? 
Kzisztr / Hungarian Explanatory DictionaryFrequency Dictionary of Verb Phrase ConstructionsLegalPrivacyOntoPopulateAnnotated Intellectual Property Claims in Complaint DocumentsSwiss statutes and regulations (semantically analyzed)Controlled Legal German (CLG)German court decision corpusLegal Case Factors ExtractionFAU AiboopenEARSmartKomSALVera-am-MittagMMI Facial Expression DatabaseSpeech In Minimal Invasive Surgery (SIMIS)EmoVoxNao-ChildrenIDVSEMAINE corpus headnods and shakesRECC-Rovereto Emotion and Cooperation CorpusVocal Expressions of Nineteen Emotions across Cultures (VENEC)Mind Reading(no-name)DoorsWordNet-Affect-OCCDigg datasetBBC News forums data setEmotional Narratives CorpusMaltOptimizerPostech Learner Corpus (POLC)a Collection of Translation Error-Annotated Corpora (Terra)A Large-Scale Unified Lexical-Semantic Resource (UBY)A Library for Large Linear Classification (LIBLINEAR 1.51)A Multi-layered Reference Corpus for German Sentiment Analysis (MLSA)A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language (MMPC)A new French Meta Grammar (frmg)An Open Toolkit for Automatic Machine Translation (Meta-)Evaluation (Asiya)an XML Based System For Corpora Development (CLaRK)Annotation Tool for Concepts and Relations (Recon) ???Arabic GigawordArabic Treebank Part 3 v 3.2Arabic-English Parallel Aligned Treebanksasa-sentiment-analysisAtlante Sintattico d'Italia, Syntactic Atlas of Italy (ASIt)Austrian Academy Corpus (AAC)BAS Bavarian Archive for Speech Signals Pronunciation Lexicon PHONOLEXBasic dictionary of FinSL example text corpus (Suvi)BRIGHAM YOUNG UNIVERSITY British National Corpus (BYU-BNC)CALBC (Collaborative Annotation of a Large Biomedical Corpus) corporaChinese Spell Checking DatasetComponent MetaData Infrastructure (CMDI) Information PageCoreference Resolution Engine (Reconcile)Corpora from the web (COW)Corpus DEFT (Dfi Fouille de Textes)Corpus Internacional do Portugus (CINTIL) PropbankCroatian Inflectional Lexicon (MOLEX) 
Cross Language Entity Linking in 21 Languages (XLEL-21)Dependency Part of BulTreeBank (BulTreeBank-DP)Dictionnaire de valence des verbes franais (Dicovalence 2)Dictionnaire fondamental de l'informatique et de l'Internet (DiCoInfo)Drugs@FDA Data FilesEmotional Speech Database for Basque (Ahoemo3)English Child Language Data Exchange System (CHILDES) Verb Construction DatabaseEuconstFlexible Error Annotation Tool (feat)Format for Linguistic Annotation (FoLia)Free Text File Merging Tool (TXTcollector)genchal-repositoryGeneral Ontology for Linguistic Description (GOLD)German Web Corpus (DeWaC)Glossary of International Relations (GLOSSIR)Helsinki Finite-State Technology (HFST) toolsInternational Workshop on Spoken Language Translation (IWSLT) 2011 parallel TED CorpusJava library for detecting Multi-Word Expressions (jMWE)Joint Research Centre (JRC) Eurovoc Indexer JEXKTH eXtract Corpus (kthxc)Lancaster-Oslo/Bergen (LOB) Corpuslanguage generation evaluation toolkit (lg-eval)Large Bilingual Speech Database for Synthesis (Ahosyn)Learner Corpus of HungarianLexicon Enhancement via the GOLD Ontology (LEGO)LGTaggerMAZEA-WebMITRE Dialogue Kit (MIDIKI)Modeling linguistic corpora in OWL/DL (POWLA)Modern Arabic Representative Corpus 2000 (MARC-2000)MorphoAdornername recognizerMultilingual Turin University Treebank (ParTUT)Multilingual UN Parallel Text 20002009 (MultiUN)Multi-perspective question answering (MPQA) sentiment lexiconNear-Identity Relations for Coreference (NIdent) CA Official Europarl test set from WMT 2008pantera-taggerParallel Corpora Collector (PaCo2)Parameterized & Annotated CMU Let's Go Database (LEGO)Persian Treebank (PerTreeBank)Perspicuous and Adjustable Links Annotator (PALinkA)PORT-MEDIA DomainRecognising Textual Entailment (RTE) 2 Test SetReference Corpus of Contemporary Portuguese (CRPC) Modality SampleSerbian morphological electronic dictionary (SrpRec)Serbian Wordnet (SrpWN)SignWriting improved fast transcriber (SWift)Similarity Metric Library 
(SimMetrics)SMI Remote Eye Tracking DeviceSpeech Database in Basque for Synthesis and Voice Conversion (AhoSpeakers)SPeech Phonetization Alignment and Syllabification (SPPAS)STEVIN Nederlandstalig Referentiecorpus (SoNaR)Swedish Kelly listSyntactic lexicon for French (Lefff)The AQUAINT Corpus of English News Textthe BiLingual Annotator/Annotation/Analysis Support Tool (Blast)The Concisus Corpus of Event SummariesThe DGT Multilingual Translation Memory of the Acquis Communautaire (DGT-TM)The Freiburg - LOB Corpus of British English (FLOB)the open parallel corpus (OPUS) UK PubMed Central (UKPMC)Verb Pattern Sample, 30 English verbs (VPS-30-En)W2C - Web To CorpusWebCrawlerWeb-Harvested Corpus Annotated with GermaNet Senses (WebCAGe)Word clAss taGGER (WAGGER)WordNet Libre du Français (WOLF)Sentiment QuizEmpoli e dintorniProposition databaseAssociation NormsYamabukiThe Database of Icelandic Inflection [Beygingarlýsing íslensks nútímamáls]InputlogPimorfoCasual English Generation Phoneme DatabaseInfomap NLP SoftwareHTTrack Website CopierRNgram Statistics Package (NSP)HTMLAsText v1.11GSplit 3ABBYY FineReader 10TermoStat Web 3.0metu sabanc? 
Turkish Dependency Treebank - addtional annotationLeXimirWordNet AtlasSUTimeERDOEdit PluseXtended WordNet DomainsTurkish Word SketchesDutchSemCorCallistoFrench corpus of event nominalsOldpress CorpusLFG Grammar of PolishSemantic TypesDependency-Parsed FrameNet CorpusA Reference Dependency Bank for Analyzing Complex PredicatesBoundary-Annotated Qur'anAncora-3LB-POSLAST MINUTEYahoo Chiebukuro Corpus Humor IndexAnnotated UGC corpus for normalizationWeSearch Data Collection (WDC)SMALLWorldsMandarin Chinese GrammarCroatian Dependency TreebankInterCorp - a multilingual parallel corpusDSimAWATIFTARSQI ToolkitLGLex 3.3Aesops fables and Andrew Lang fairy tales collectionColloquial Egyptian ArabicFAUST Feedback AnnotationOral History Annotation ToolAnnotations for progressive aspect sentences in the spoken section of the BNCA Cross-Lingual Dictionary for English Wikipedia ConceptsSocial Constructs - Pursuit of powerCapek: an annotation editor for schoolchildrenSubjectivity Lexicon for Dutch AdjectivesGerman Food Relation DatabaseChinese Whispers Paraphrase Corpus (CWPC)FAUST quality assessmentsTIGER Treebank (release August 2007)German Parliament SessionsSemSimANALECPrague Czech English Dependency Treebank 2.0PathoJenFTW multi-speaker synchronous acoustic and 3D facial marker data in Austrian GermanModeling Textual Organization (MTO) CorpusCatalan WebcorpusNTCIR-9 SpokenDoc test collectionVerb Lexicon and Event DurationsJAKOB-LexikonEmail corpus annotated with social power relationsDramaBank corpus and Scheherazade annotation toolA Sentence Database for Chinese Reference GrammarPrimeCoefBlademistress corpusDirectional corpora in EuroparlA Universal Part-of-Speech TagsetTAC 2009 KBP Gold Standard Entity Linking Entity Type listEnClueWebArabic Treebank (ATB) Part 3 v 3.2Casual English Conversion System DatabaseTrilingual Parallel (Arabic-Spanish-English)Corpus-SampleISO 24617-2 Semantic annotation framework, Part 2: Dialogue actsPropbank-BrhandAlignedCorporaJPENTWSI 2.0: Turk 
Bootstrap Word Sense InventoryDiachronic German Corpus TBa-D/DCArabic Subcategorization Frames in the Arabic TreebankYADACTED-LIUMCLASSYN text type-specific corporaCorpus of Pronominal Anaphora of the QuranPoliMorfKPWrRussian Automotive CorpusThe Herme Database of Spontaneous Multimodal Human-Robot DialoguesdeepKnowNetCATGerman political news corpusMulti-perspective question answering corpus (MPQA)Birkbeck spelling error corpusSpanish C-ORAL-ROM-ELEQurSimSFU Review corpusPortal Lngua PortuguesaRWTH-PHOENIX-Weather CorpusRIDIRE-CPICLIMBThe Hindi PropBankTRIS CorpusGrammatical Framework (GF) Resource Grammar LibraryMultilingual Central Repository version 3.0Quranic Arabic CorpusDECODAOntology of Italian LinguisticsSquoia TreebankLinguistic Linked Open Data cloudNgramQueryAhoDiarizeHal Scientific Paper Corporaapertium-kircSRIDENTIC CorpusWordNet mapping to Kyoto OntologyAnnotation Discursive (ANNODIS) CorpusFinnish WordNet (FinnWordNet)Connective annotation over EN/FR EuroparlLatvian resource grammar in Grammatical FrameworkIT-PANACEA SCF test suiteCorpus of Sentences for User Interaction in Pronunciation Learning SystemsMicrosoft Researc Lab India's Hindi-ENglish Transliterated Song Lyric DataThe REX corporaCARDS-FLYMo Piu data baseMultitradPolish Sejm CorpusIntegrated Reference Corpora for Spoken Romance Languages (C-ORAL-ROM)Chronolines CorpusWebAnnotatorDicoInfoGene renaming corpusapertium-es-anROMBACNoor Book CorpusGold Standard for English human nounsJWNLSimpleULexThe Twins Corpus of Museum Visitor Questionsquestioncorpus.ptInternational Corpus of Learner EnglishSwedish Framenet (SweFN)Large Scale Syntactic Annotation of written Dutch (LASSY)Quaero Terminology Extraction Evaluation Patents CorpusTrsor de la Langue Franaise informatis (TLFi)FixISS databaseCorpus of Computer-mediated Communication in Hindi (CO3H)Le Petit Prince in UNLPORTMEDIA LangBulgarian X-Language Parallel CorpusData repository of spontaneous spoken CzechCroatian Valency Lexicon of Verbs 
(CROVALLEX)goo300k corpus of historical SloveneETAPEBulgarian National Reference CorpusRomanian TimeBank corpusMIMEstonian reference CorpusBrandPittMULTIPHONIAMASC word sense corpusTAC 2010 KBP Evaluation Source DataExample Database of Japanese Multiword Functional ExpressionsUniDic-2.1.0RELcat a Relation RegistryAttribution DatabaseEllogon language engineering platformFirst Certificate in English (FCE) exams of Cambridge Learner Corpus (CLC)Dot object predication gold standardSimcoach speech synthesis evaluation corpusAustralian National CorpusInforexLexItDutch Parallel Corpus (DPC) (subpart)Irony detectionText::Perfide::BookSync (Perl module)TIMENTajik-Farsi Persian Transliteration SystemStockholm MULtilingual TReebank (SMULTRON)Suffix Tree Language ModelGisterTreebank.infoa Beautiful Anaphora Resolution Toolkit (BART)EMO_EventsNTCIR-7 Patent Mining dataXFST Murrinh-Patha morphological analyzerRembrandt frameworkProsomarkerAnnotation of instructional textsNeoTagPersian Part of Speech TaggerTurkish Paraphrase Corpustexrex web corpus toolsuima-commonCorpus of Indefinite UsesSeCo-600AledaStockholm EPR Clinical Entity CorpusGerman Logical Metonymy DatabaseAutomotive Repair OrdersWES baseThe LarKC life sciences datasetGlottolog/LangdocRomanian WordnetAnnotated Film Dialogue CorpusEPIC Twitter NLP Development DatasetChiba three-party conversation corpusPiTuLetsMT!HunOrQuaero Football CorpusALLEGRAmate-toolsSciTexTimeBankPTEst Rpublican Parsed CorporaLyrics&NotesThe Nordic Dialect CorpusSzegedParalellFXDEGELS1MLTaggerApertium Spanish Monolingual Dictionary from Spanish-Catalan language pairVirtual Language ObservatoryAnItaPolarity lexicons in SpanishConanDoyle-neg corpusSpeech data corpus for verbal intelligence estimationEstonian Multiparty dialoguesSimplextBaltic Language Named Entity Recognition (NER) corpusNomcoI3Media multilingual emotional speech corpusPOLEXPESHeidelTimeAjkaAn Annotated, Multilingual Parallel Corpus for Hybrid Machine TranslationLDC User 
InterfaceCzech Web Corpus 2011UniDic for Early Middle Japanesetexto4sciencePolish Multimodal CorpusDeCourANVIL toolPETSemeval-2010 Japanese WSD task datasetPRONTO Firefighter CorpusQuestion Pairs from WikiAnswersmaui-indexerPEXACCSentiStrengthNKI-CCRT CorpusEnriched GENIA Event Corpusiula2standoffMinho Quotation ResourceCultural Heritage item - Wikipedia matchGerNEDB3-toolCorpus on Debate and DeliberationJoint Research Centre JRC-Acquis German-EnglishCzech-English Parallel Corpus (CzEng) 1.0Divergence Measure Tool (DMT)Customized Europarl corpusCLTC corpusMetadata editorLarge Dataset of French-English SMT Output CorrectionsC-ORAL-BRASIL IWiktionary lexical networkHOO Evaluation FrameworkCroatian collocations gold setPolish WordNet (plWordNet)Elhuyar Basque-Chinese predictionaryWOLFT2T3Phase One data release for Blizzard 2012Arabic Wordlist for SpellcheckingTamil Dependency Treebank (TamilTB)Netlog Corpus and Chatty SubcorpusBritish National Corpus (BNC), spokenKIT Lecture Corpus for Speech TranslationAn Open Source Persian Computational GrammarISABASE - 2Spatial Containment Relations Between EventsYet Another Term Extractor (YATE)CALBC CorporaNational Center for Sign Language and Gesture Resources (NCSLGR) corpus, Boston UniversityThe Switchboard CorpusRelaxCorDamascene Colloquial Arabic SpeechAncient Greek Dependency TreebankEuroparl v.6Wikinews Multidocument Summarization Data Collection ToolThe Stanford Parser: A statistical parserTwitter Emotion CorpuseXtensible MetaGrammarTIGERexpSVMlightA Collection of Russian Corpora (University of Leeds)SrpRec - Serbian morphological electronic dictionary (SrpRec)Aligned ConceptNetsStatistical Engish-Myanmar Machine Translation SystemTsinghua Chinese Treebank (TCT)English-Croatian Parallel Corpus (EngCro)Corpus QuaeroPalula CorpusDragon Naturally SpeakingPhonologie du Franais contemporainTemporal Entailment Rules for RDFS and OWL-Horst dialectAnnotated Corpus of Automotive Engineeringnews articles on epidemicEvent 
levelsJapanese Word Dependency CorpusOntoverbetymosJoint Research Centre (JRC) Quotes Collection for Sentiment AnalysisAutomatic annotated training data for temporal slot fillingDaemonetteSDEWACGreek-Chinese Interlinear of the New TestamentGreek POS TaggerNorth American News Text Corpus (LDC95T21)Icelandic Parsed Historical Corpus (IcePaHC)ELRA-W0051Language Similarity TableEmotiWordEnglish-Russian Wiki-dictionaryTUNA-Lex corpusWeb Service Architecture for AlignersAVATecH BAtch eXecutor ABAXLibrary of Natural-Language Representations of Formal RelationsBilingual LexiconJezikovne tehnologijeSPIDIXSweVocOSCARRobertino-game corpusCORIS/CODISLignocellulose CorpusNLPipethe Callhome Mandarin Chinese CorpusTalkBankThe Kalashnikov 2K dependency bankLitRecAssociative Concept Dictionary (ACD) for VerbsEpinions Annotated Reviews DatasetBulgarian Sense Annotated CorpusDutch de/het noun classification script for TiMBL/Frog outputStandford ParserJapanese Corpus of Diverse Document Leads with Anaphoric AnnotationCAMELEON comparable corpusUighur to Chinese Dictionary DatabasehunspellEventMapping.rdfcorpus_shellThe KPG English CorpusYet Another Multipurpose CHunk Annotator (YamCha)Croatian Sentiment LexiconLeuven Arabic DatabasePashto monolingual corpusOld Hungarian CorpusIMAGACT Annotation InfrastructureKinOathMathHindi Discourse Relation BankCroatian Morphological LexiconFips Web ServiceRadziszewski Acedanski Tagging Evaluation MethodTagged Quranic CorporaSETIMESVietnameseNERCroatian CG Morphological Disambiguation RulesSequence-based alignerBiomedical why-question answering corpusThe Portuguese Regional Accent DatabankHausa Internet CorpusMultiword Expressions Toolkit (mwetoolkit)Spoken Corpus of Standard European PortugueseLM/PRUDouble-Edged WordNet (DEWN)ConceptNet 5Multiple-Chinese Translation Part 4 (MTC4) datasetFrench preprocessing OpenNLP ModelsEnron questins and answersAmerican National Corpus (ANC) Manually Annotated Sub-Corpus (MASC)LEXCONNTurkish Dependency 
ParserCorpus of evolution of designation of events in FrenchAnnotation guidelines for event nominal tagging?wigraNIdent-ENDELAAesops fables timeline annotationsGerman (P)LTAG treebankOpinionFinder Subjecitivity LexiconTiGer2DepCroatian WebcorpusBonsai Dependency Parser v3.2MoCap ToolboxTAC 2010 KBP Evaluation Entity Linking Gold Standard V1.0Ontonotes 4.0 newswire sectionHTML and PDF text and metadata extractorsMorfeusz SGJPCELCT CorpusModal sense corpusC-ORAL-CHINAKyoto University Case FramesSvenskt associationslexikon (SALDO)OpenSubtitlesSpanish Learner Language Oral Corpora (SPLLOC)CeramicaSpanish Aragonese DBpedia Abstract CorpusGold Standard for English abstract nounsLang-8 Learner CorpusGrETEL (Greedy Extraction of Trees for Empirical Linguistics)Quaero Terminology Extraction Evaluation Abstracts CorpusDependency Shift Reduce parser (DeSR)Extented DOLCE OntologySYNC3 Collaborative Annotation ToolIMS Open Corpus WorkbenchChinese-English Parallel Aligned TreebanksUser generated content out of automotive forumsIrish Dependency TreebankBaboukPledari GrondApertium Catalan Monolingual Dictionary from Spanish-Catalan language pairA Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British EnglishTartu Multimodal DataCARTOLACrisis Management CorpusElhuyar Basque-English dictionaryT2T3 TIMEX3 corporaInternational Workshop on Spoken Language TranslationBuckwalter Arabic Morphological AnalyzerFrown corpusPre-processed test corpus of RussianSpanish WikipediaCorpus Analysis and Validation for TimeML (CAVaT)Estonian Morphological DisambiguatorWebCAGePROIELIllinois Named Entity TaggerMyanmar-English-Myanmar DictionaryCorpus de Franais Parl ParisienDisjoint EventsModern Greek Spontaneous Essay TextAVATecH wrappers and utility recognizersMicrosoft Research Paraphrase Corpus (MSRP)Electronic Orange BookThe Wenzhou Spoken CorpusChild Language Data Exchange System (CHILDES)TAC 2008 Update SummarizationAligned Annotation ToolKazakh to Chinese Dictionary 
DatabaseAQUAINT-2TimeBank 1.2WordAlignerHumorWMBTILCI Annotation ToolEuropean Medicines Agency (EMEA) documentsInterrogatio AnselmiOpen American National Corpus (OANC)MorfeuszGerman (P)LTAG lexiconThe New York Times Annotated CorpusCzech WebcorpusEmospeech MPCorpusTAC 2010 KBP Training Entity Linking V2.0SentiSenseMorfologikAnnotated corpus of German political newsC-ORAL-JAPONSVMToolReview corpus annotated for speculation and negationTalbankenMicrosoft Video Description CorpusSpanish Learner Oral CorpusTxtCeramGold Standard for English non-deverbal eventive nounsConverter Freeling 2 DESROntoValence DictionaryPolish Parallel CorporaTimenEvalCWB-treebankIllinois Part Of Speech TaggerApertium Bilingual Dictionary from Spanish-Catalan language pairWikiWarsGermanSentimentDataMdbg Chinese-English dictionaryWapitiLEMLATIMAGACTJava Wikipedia Library (JWPL)Danish Dependency TreebankGramTransSentiment-annotated set of quotationsMyanmar Word SegmentationChoix de Textes de Franais ParlParagraph alignment list (LINA-PAL-1.0)Paraphrase CorpusRobertino-gameAUDIMUSIntra Chunk Dependency ParserKyrgyz to Chinese Dictionary DatabaseFR-TimeBankCroatian CG Morphological TaggerText::NSPDutch WebcorpusLexicon-Grammar tablesTAC 2011 KBP English Evaluation Entity Linking Annotation v1.1Gold Standard for Spanish concrete nounsLEXUSTIPSemuima-connectorsIllinois ChunkerTime4SMSGermanSentiStrengthLexiconSCOLA broadcast newsBasque-Chinese comparable corporaMorphTaggerLexicon totius latinitatisSRI Language Modeling Toolkit (SRILM)Corpus de Rfrence de Franais ParlParagraph Clustering List (LINA-PCL-1.0)ManhattanChinese to Uighur, Kazakh and Kyrgiz Parallel Dictionary DatabaseA2ST-COMPWeighted Lexicon of Event NounsUCS toolkitEnglish (P)LTAG treebankItalian WebcorpusTAC 2011 KBP Cross-lingual Training Entity Linking V1.1Bulgarian Morphological Dictionary XML editorGold Standard for Spanish human nounsWeb 1t 5-gram corpus (1.1)Illinois Named Entity RecognizerTime4SCIInstructional ManualsConnexor 
Machinese Syntax parserScorerRiTa.WordNet a WordNet library for Java/ProcessingLocalMaxsEnglish (P)LTAG lexiconPolish WebcorpusTAC 2011 KBP Cross-lingual Evaluation Entity Linking Annotation V1.1Gold Standard for Spanish semiotic nounsTimeMLIllinois Coreference ResolverNUS SMS CorpusLDC HUB4FEMA: Help After a DisasterAntconcPropBank (Proposition Bank)Spanish WebcorpusGold Standard for Spanish non-deverbal eventive nounsIllinois Semantic Role LabelerCrowd-Key-phrasesGovernment RecordsContextesNoClonePenn Treebank NP annotationIllinois Wikifierearly Government RecordsA free/open-source marker-driven example-based machine translation system (OpenMaTrEx)A Treebank for Finnish (FinnTreeBank)Automatic Syntactic Analysis for Polish Language (ASA-PL)GNU Linear Programming Kit (GLPK)Joint Research Centre JRC-AcquisPolish Word SketchesSoftware for Clustering High-Dimensional Datasets (CLUTO)Squoia SpellcheckSyntax in Elements of Text (SET)Topic Detection and Tracking (TDT Phase 3)Lin-EBMT^REC+The Survey Zone (SurZe)RACAI TTS Speech SegmentationmyresPLLMpre-test, training, post-test experimental designACE 2004 Multilingual Training CorpusPersian Syntactic Verb Valency LexiconFipsNormalizerRoGERGeneralLexicon_Fr-EnPerson Name Recognition for Alpine TextsIndian Language Part-of-Speech Tagset: HindiPrinciples of Part-of-Speech (POS) Tagging of Indian Language CorporaPhrase Structure GrammarEuroparl v6 French-EnglishKorpusik US II PWr (Small Corpus for WSD)ToTaLePOS tagged Data setOntoGenKashmiri Part of Speech TaggerBLEU/NISTPronunciation Errors from Learners of English Coprus and Annotation (PELECAN)A Chinese-English Code-Switching Speech Database (CECOS)Fisher English Training SpeechDysarthric Speech DatabaseBengali Speech CorpusMongolian speech corpus for speech synthesis (NUM, NITP, NECTEC)Multimodal corpus in multi-party conversationsMalay Emotional Speech Database (MESD)Buckwalter lexiconApertium Spanish monolingual dictionary from the Spanish-Catalan language pairIndiana 
Cooperative Remote Search Task (CReST) CorpusTask-10 test and training data Semeval 2010Bijankhan CorpusFeauture Vector Set and Tree Forest (SVM-LIGHT-TK 1.2)PRObability-based PROlog-implemented Parser for RObust Grammatical Relation Extraction System (Pro3Gres)Lingua::TreeTaggerCorpora for Named Entity Recognition of Chemical Compounds (IUPAC Corpus)Domain Adaptive Relation Extraction (DARE)HCRC Map Task CorpusIcelandic Frequency DictionaryJoint Research Centre (JRC) NamesMulti-Perspective Question Answering (MPQA) Opinion CorpusOpen Mind Indoor Common Sense (OMICS)Rhetorical Structure Theory Tool (RSTTool)Suggested Upper Merged Ontology (SUMO)Treebank-2 TBa-D/Z Release 7Unified Medical Language System (UMLS)Unified Medical Language System (UMLS) MetathesaurusApertium linguistic data for the Spanish--English language pairApertium linguistic data for the Breton--French language pairApertiumWikipedia Category HierarchySemantic Parser (no specific name)opinion holder predicatesRST Spanish TreebankEmotiBlogControlled Language for Crisis Management (CLCM)KPG English Learner CorpusProject GutenbergEvent LexiconIT-TempeEval-2 Data SetWiki50QuestionBankNLP Resource Metadata Questions Treebank (NLP-QT)Image description CorpusPunjabi Resource GrammarAggregated texts and their semantic representationsAymara LFG GrammarTranslation memories of O?s ar Brezhonegnewstest2008newstest2010Minho Quotation BankPublic Health Opinion Corpus (PHOC)BulTreeBankTBa-D/Z connective annotationMultilingual Named Enity Annotated CorporaNew York Times Annotated corpuswikAPIdiaPAN'10 plagiarism detection corpusi2b2 2010 corpusFrench-Romanian parallel corpusXML model of WikipediaBeygingarlsing slensks ntmamls (BN)Mrku slensk mlheild (MM)Reusable Resources for the Romanian Language (RRRL)Health-related sentiments and opinionsOntology on Narural Sciences and TechnologyWSMT corpus annotated for sentimentMultilingual sentiment dictionariesWiki-Biographic-CorpusMultilingual summary evaluation 
dataMaupassant: segmented and tagged text into types by XML tagsNewstrain-08CoNLL2000 shared task datasetEnglishLIperl scriptChinese German EnglishGermanJapaneseArabic Italian SpanishBengaliPolish KoreanRussian ChineseCzechManipurijava scriptNot ApplicableHindiEuropean LanguageJava SwedishSwedishFrenchIndonesianSpanishUyghur Arabic frenchEnglish; Chinese; ArabicUrdupersian2600 languages--not independent Hindi Catalan Galician BasquePortuguesemainly English ...Japanese; FrenchKorean others Bangla Telugu Chinese and Arabic Gujarati-HindiHindi-GujaratiBrazillian PortugueseIbanItalian Bengali Marathi Urdu Mandarin and ArabicSimplified ChineseTraditional ChineseDutchJapanese (and Chinese) Portuguese Finnish French and GreekEnglsihEmglish Malay JapaneseMultilingualHebrewEuropean Portuguese HaitianEnglish portionC#; JavaHungarianRomanian lots of. EnglishSouth African ZuluChinese and English isiNdebele isiXhosa isiZuluAfrikaans Burmese LaoThai RomanianTurkishAmerican Sign Language ?otherSwahili Indonesian NorwegianAlbanianBulgarianGreekModern Standard Arabic Bulgarian Byelorussian (Belarussian) Italian Sign Language Russian DutchArabic.English.CzechENSouth African English AfrikaansBasque Spanish and othersSpanish; to-be-extended to EnglishSlovenianCatalanSerbianLatin RDF-XML WSDLEnglish (95%)fr otheramazigh LatvianFilipinoNorwegian Serbian TurkishEuropean Portuguese DialectsGalician Croatian Czech Danish Senegal)Wolof (Niger-Congo Amharic K'iche' QuechuaTigrinya Estonian GreekCroatianSloveneLithuanian ChinseLSF Levantine Arabic PashtoIraqi ArabicMore than one thousand languagesSorani Kurdish English soon TigrinyaAmharicFrench Sign Language (LSF)Sanskrit de en es fiDanishdeesidit22 european union languagesGerman Sign LanguageIrish all sign languages all writen languagesfrench sign languageEarly Modern German (1650-1800)multiple languages (currently 55)bilingual Portuguese-EnglishArabic (Algeria)DGS Mexican SpanishAmerican EnglishnlMapudungunItalian English English (UK and 
US)Dutch (BE and NL) Lithuanian Swedish (and others not used)Old FrenchGeorgianCatalan Sign Language SepediIcelandicLSEGujaratiCebuano Arabic Sign Language (ArSL)Arabic Language (semitic language based on a root-and-pattern structure)G?k?y?TelugudaelEnglish; to-be-extendedNorthern Sotho (Sepedi) nepali sanskrit Old SwedishIgbo constantly expanding range of languagesWideDagbani Anyi EgaAmerican Hungarian Mandarin Chinese Modern Standard Arabic Finnish & EstonianQuranic ArabicViennese varieties PolishJAPE70 languagesEnglish (mainly)Banglamany (with varying coverage) RomanschDreents... All european languagesEnglish (US)MyanmarArabic (Modern Standard variety) Egyptian Cariene Arabic progressively extendedLatvian Maltese Faroese Icelandicmultilingual (over 200)about 200 languagesmultiple languages DholuoLuo Urdu collection in progress FinishHungarian -- but the algorithm used is language independent extendible to othersFinnish Sign LanguageTagalogAuslan (Australian Sign Language)PunjabiItalian DialectsMarathi LSE Spanish Sign Language English-isiZulu English-Sesotho sa LeboaEnglish-AfrikaansBSL signing with English subtitles Ge'ez Tigrigna Armenian Gothic Old Church SlavonicEnglish biomedical domain nlen.fr Oriya Punjabi ?South African Sign LanguageEstonian Welshmultiple language pairs (currently 72) Myanmar ISLABSLEnglish (Medical Terminologies) Moroccan English-FrenchFrench-EnglishFinnishGerman Sign Language (DGS) ?others ZulujaGreek Sign Language Persian The method is applicable to other language pairs. Albanian Russian Sign LanguageJapanese Sign LanguageChinese with a few Siddham SanskritNganasanvenetoItalian<->Englishbilingual English-PortugueseHungarian-SlovenianOfficial Languages of India Gulf Iraqi Levantine Moroccan dialectsEgyptianMalteseLuxembourgish other non-english fr gl pt Portguese Traditional Chinese Esperanto Oshiwambo (Ndonga) Purhépecha Q'echi/KekchiÑahñú (Otomí) Khwedam Ndebele S. 
SothoXhosaAll eleven official South African languages and others in progress sv Romance Languages ....French (this work)Irish Sign LanguageVenetan Baule Ibibio MedefaidrinAnyiBiscayan (a Basque dialect) Lule Sámi South SámiNorth Sámi englisch mainly lesser-documented languagesmulti-lingualBehavior LanguageEnglish' Oromooindex contains all languages represented in WikipediaNorthern Sotho cs it ruEnglish patent domain Swahili SiGML/HamNoSysSLMongolianMalayMandarin ChineseStandard ArabicBretonAymaraYue ChineseGheg AlbanianStandard ChineseSwiss-GermanMandarinEnglish spoken by JapaneseSwiss-FrenchUkrainianUS EnglishCastillian SpanishBritish EnglishPraatHungarian (hun)English (eng)French (fra)German (deu)Spanish (spa)Swedish (swe)American Sign Language (ase)Turkish (tur)Armenian Azerbaijani American English (eng)Pushto Modern Greek Romansh Kyrgyz (kir)Classical ArabicBurmese EnglshHindi (hin)Old French (fro)Australian EnglishUighur Koine GreekModern Greek (ell)Russian (rus)Icelandic (isl)Ancient GreekMandarin Chinese (cmn)Egyptian ArabicGulf ArabicLevantine ArabicIranian Persian Mesopotamian Arabic prologAmharic (amh)Brazilian Portuguese TamilMSABrazilian Portuguese (por)Italian (ita)Afghanian PersianASLLISRomanian (ron)Kirghiz Contemporary American EnglishSlovenian (slv)Turkmen (tuk)Biblical GreekFaroese Dutch (nld)MocovQuechuaUzbek (uzb)Swiss German Ukrainian (ukr)Welsh Bahasa MelayuAzerbaijani (aze)Basic EnglishSwiss German Sign LanguageEarly New High GermanbulengCatalan (cat)Tajik Cusco Quechua C++FrisianMiddle Dutchother dialects of DutchJapanese (jpn)Estonian (est)srRU17th Century DutchFlemishStandard Dutch mixedIranian Persian (pes)Bosnian Czech (ces)Kazakh American Sign Language (ASL)PalulaDanish (dan)Aragonese Bulgarian (bul)Gheg Albanian (aln)Standard Arabic (arb)Croatian (hrv)Punjabi (pan)Tajik (tgk)Mo PiuVietnameseGujarati (guj)Urdu (urd)Portuguese (por)Old NorseMurrinh-PathaFinnish (fin)Egyptian Arabic (arz)SantomeFaroese (fao)Norwegian Bokml (nob)Belarusian 
(bel)German (Austria)Flemish DutchKazakh (kaz)Kyrgyz TatarNorwegian Nynorsk (nno)North Levantine Arabic Polish (pol)Telugu (tel)Old CzechEuropean Portuguese (EP)Bengali (ben)Swedish Sign LanguageMoroccan Arabic Scottish EnglishWenzhou dialect in ChinaCilubMorphological analyzerTool: Machine Translation DecoderWord Alignment Toolkit and Phrase Extraction Toolkitstatistical machine translation systemMarkov Logic EngineNot AssignedMarkov Logic Network frameworkNot AssignedNot AssignedText-to-Speech platformToolkit for building integrated WFST-based models for LVCSRToolkit for recording infant-caregiver dialoguesMultiligual Speech RecognizerToolkit for building WFST-based Grapheme-to-Phoneme and Phoneme-to-Grapheme systems.Syllabifier - Syllabification toolmachine learning classifierXML-based software system for corpora developmentMulti-label classification software plus training data in 22 languagesString Comparison Measurements Librarysemantic concordance (lexicon interlinked with a corpus annotation)LR Acquisition Tooltext corpus annotatedDatabase of propositionsRessource: Associationsa dictionary look-up tool with Katakana variant recognitionLanguage Modeling and Information RetrievalLexical resource, morphological lexiconKeystroke-logging programmorphological engineAlternative phoneme setLSA-Latent Semantic AnalysisWebsite copierR is a free software environment for statistical computing and graphicsNgram generation and Statistical ProcessingConvert HTML to textFile splitterOCRTerm extractorAdditional annotation on an existing treebankTool for lexical resources management and query expansionWeb applicationResource-Tool: Temporal Tagger (NEW VALUE)Repository System (NEW VALUE)Text editor (NEW VALUE)Machine Translationlinear programming solverWord sketches: mid point between corpus and lexiconClustering toolTool for conducting surveys related to the pragmatics of languagesSVM ToolkitMinimally supervised Machine Learning System for Relation ExtractionTransfer rules 
and dictionaries from rule-based machine translationRule-based machine translation engineStatistical machine translation frameworkCategory HierarchyCorpusEvaluation DataTool: discourse coherence modelAnnotation ToolEvaluation ToolTokenizerTagger/ParserNamed Entity RecogniserLexiconOntologyMorphological Analyzer/GeneratorGrammar/Language ModelMachine Translation ToolNot AssignedRepresentation-Annotation Formalism/GuidelinesSentence ChunkerTranslation Support Web SiteSentence SegmentationNot InsertedInformation Retrieval systemWord Sense DisambiguatorMorphological analysersummarization moduleopen-search engine infrastructureCorpus Query ToolLanguage IdentifierEvaluation PackageAPIs for ontology managementsearchernatural language toolkitkeyphrase extractorclustering toolManipuri StemmerLanguage Modelling ToolkitSMT DecoderChunkerSupport Vector Machine Learning ToolWord Alignment ToolEnglish Morphological AnalyzerMachine Translation toolkitMachine learning softwareTerminologyEvaluation Methodology/Formalism/GuidelinesSignal Processing/Feature ExtractionQuery Loglanguage learning toollanguage model toolkitCRF training toolkitopen source systemLabeling toolWord-aligned corpus coupled with a multi-lingual ontologyStatistical decoderMonte-Carlo based paraphrase generatorsummarizerData Mining ToolJava library for interfacing with WordnetQA datasetmachine learning packageOpenNLP Java Based PackageSurface RealiserSemantic Network CreatorUnsupervised, Language Independent Sentence AlignerBackground Knowledge BaseCoreference ResolutionInformation Extractorrhetorical structure taggerstructure search engine/annotation data repositoryCode: Semantic ParserCode: Semantic Parser / Language GeneratorGrammar customization systemProficiency testing toolJava interface to WikipediaTypological DatabaseTool to restructure FrameNet database and annotationsDemo SoftwareSummarization ToolCrowdsourcing for Document SummarizationRelevant Term extractorStatistical Relational Learning 
SoftwareTranscriberClassifierProsodic AnalyserText-to-Speech SynthesizerRecording ToolSpeech Recognizer/Transcriber SyllabifierSpeech synthesiserFinite state toolsMorphological AnalyzerFormal grammar for parser generationLanguage databaseNLP FrameworkCorpus analysis softwareText Navigation ToolReference chain identification moduleRepresentation-Annotation Standard/Best PracticeOntology development methodologyOntology Development Environment / ToolStemmerGuidelinesStandardsCourse materialterm extractorterminological databaseSuite of analyzers: tokenizer, morphological, tagger, parser, WSD, coreference, ...finite-state compilerCorpus search toolNLP and Semantic Services Platformmorphological analyzer and generatorlanguage technology infrsatructureTool Search Tool for Linguistic Knowledge DiscoveryResource Tool: Morphological Analyzer (segmenter + POS tagger)Resource Tool: Lexicon AcquirerStatistical machine translationResource toolTransliteratorSpeech RecordingsSearch engineplatform for semantic processing of textSpeaker recogniserframeworkautomatic syllable segmentation toolCorpus Query SystemTool for Acquisition of Verbal Subcategorization Information from CorpusPhrase-based statistical machine translation systemLngugage modeling toolN-gram based statistical machine translation decoderTools for querying an N-gram databaseTools for querying a tagged N-gram databaseConcordancerThe Weka workbench[1] contains a collection of visualization tools and algorithms for data analysis and predictive modellingADN-ClassifierEditing ToolLRT portalStatistical Speech ResourceText BookResource: encyclopedic knowledgeReasonerKnowledge representation toolResource: morphologyan implementation of many classic semantic relatedness approaches based on wordnettools for accessing Knowledge baseWord alignerStatistical Machine Translation toolkitsoftware that is toolkit and resource environment to assist translationCorpus, Tools, Evaluations, etc.Corpus search and handling toolA syntactic 
judgments databaseText-to-speech synthesis front endPart-of-Speech TagsetAnnotated corpora and annotation specificationsLexical Isolation Point predictorOntology EditorLexical analysis softwareCorpus Format ConverterTerminology management systemFor accessing to a resource datasetCorpora compilation and evaluationlanguage service infrastructureLanguage Modeling ToolLexicon/corpusResource Tool: Treebank Searchcorpus semantically annotated with respect to an ontologyMachine Learning Classification ToolMachine Learning ToolAutomatic summarization toolMetadataMultilingual Research ToolSentiment Analysis ToolCross-language retrieval and machine translation platformTextual Entailment Recognition SystemGrammar checkerControlled Language guidelinesMT engineSpell CorrectorStatsitical language modeling toolSMT toolSentence splitterLemmatizerOntological and lexical resourceTool for transcribing scanned textInteractive-Predictive Handwritten Recognition ToolsdatabaseThesaurusResource-Tool: Coreference Resolutionweb-based authoring tool for NLP-intensive language learning activitiesAdaptation systemSpeech and Handwritten Text RecogniserWikipedia APIUIMA ToolkitConditional Random Fields (CRFs) taggerCross Document Corefernce SystemA suite of NLP toolsOnline EncyclopediaEntity Mention Detection (EMD) toolTool. 
Word similarity.Corpus interfaceLibrary providing language analysis servicesUrdu Verbs Lexicon BuilderText Mining SystemMachine TranslationParagraph splitting and sentence alignment for parallel corpusResources integrationIPR Acquisition ManualA tool for reconfiguring the FrameNet lexicon and corpusMorphological analyzer and synthesizerSentence alignerCorpus Building and Annotation Tool EvaluationWorkflow Management ToolMachine learning libraryTool for lexical resources management and query expansionTree/Graph Transducer, Development Environment, Text RealizerQuestion Answering systemInformation Retrieval ToolkitEvent SemanticsCALL translation game engine and associated language resourcesSpoken Dialogue ToolkitA list of categories with examples of language useRelation ExtractionSemantic associationsResource-Tool for interaction between existing tools for annotation and exploration of rich linguistic dataLanguage Modeling and Information RetrievalLanguage Processing InfrastructureLearning AlgorithmYahoo!'s local listings in ChicagowebserviceSystem for lexical acquisitionWikipedia Indexation ToolWeb Servicea tool for constructing virtual humansNews ExplorerAcoustic and language modelsSpeech recogniserLanguage modelling toolsearch systemTool for Lexical Semantic Knowledge AcquisitionNode of Language Technology InfrastructureSMT toolkitword-to-word alignerNLP MiddlewareLanguage Service PlatformResource-Tool: Speech synthesistranslation modelCorpus Management Systema software package for Support Vector LearningDependency extraction utilityVirtual Game WorldAnnotation tool, search engineparaphraser, composition tool, authoring aid, MT pre-editor, web serviceparaphraser, composition tool, authoring aid, MT pre-editorcorpora annotation, paraphrasing, translation, resourcesNLP Development ToolA general purpose langyage engineering platformA grammatical inference toolCoreference Resolution SystemMachine Translation DecoderLanguage Modeling Toolkitsentence alignment 
toolCultural Graph ComparatorLemmatiserText simplification toolTIMEX recognizerTool: question-answer chatbot generated automaticallyStatistical signal processing systemPhone segmentation tooltokenizar, Morphological analyzer, POS taggerA Knowledge Base with Lexical-Semantic Relations between wordsDecoder for phrase based statistical machine translationMorphology analyzerText segmentationEncyclopediaResource-Tool for AcquisitionTagsetNatural Language engineering platformStatistic model implementationData Collection ToolStatistical Machine Translation systemMachine TranslatorSuite of modular Natural Language ProcessingData Collection and Annotation Management SystemWeb Text Collection SystemMachine Translation softwareNLP ArchitectureSet of tools to face orthographic migration problemsparallel grid execution environment for HLT toolstransliteration similarity estimation softwareA Geoparsing engineTool for mapping language resources and usersResource-Tool to map extracted knoweldge to WikipediaOntology Population ToolTool for reconstructing historic Wikipedia snapshotsWeb siteDigital library management systemquery languagePsycholinguistic DatabaseFrequent Itemset Mining ToolAutomatic Speech Recognition toolSpeaking 3D avatarCross-Lingual Information Retrieval toolQuestion Answering toolText-To-Speech toolLinguistic pipeline softwarerepository of bilingual lexiconsData Category Repositorymodel representation of linguistic data comming from several formatsformat descriptionformat description of features for example for linguistic dataLexicon toolsImage AnalyserApplication for Semantic DesktopCorpus-Based Online DictionaryLanguage identifier; Speaker recognizer; Signal processing/feature extraction; Image analyzer; Tokenizer; Evaluation toolClinical NLP PipelineMachine Learning APIOpen source NLP frameworksoftware3D toolkite-learningSynthetic SL performanceMachine Translation SystemData Entry SystemCALL system & related resourcesMultimedia language learning 
applicationsDictionary Content Management SystemControlled Legal LanguageDependency Parsing Optimization ToolRepresentation-Annotation Standards/Best PracticesSoftware ToolkitCorpus ToolAnaphora resolutionEnvironment for searching syntactic, semantic, psycholinguistic and distributional annotation about the CHILDES English corpora.Text File Merging ToolRepository of NLG task materialsEvaluation Methodology/Standards/GuidelinesSpoken Dialogue ToolNamed Entity RecognizerEye Tracker Acquisition Tool DictionaryInformation Retrieval ToolLogging ToolPhoneme SetTerminology Tool TreebankTemporal TaggerRepository SystemText editorCorpus - Tools - Repository of semantic similarities between wordsPriming coefficientsLanguage Resources/Technologies InfrastructureSpeaker Recognizeropen-word--concept dependency reprezentations and selectional preferencesLexicon ToolCoreference Resolution ToolkitProsodic AnalyzerDataset-Entity EquivalencesMetadata editorLanguage SimilaritiesTextual Entailment ToolConversion tool for constituent-to-dependency conversionSignal Processing/Features ExtractionDatabase schemaConverterData Mining toolkitDuplicate FinderSpell CheckerSpeech SegmentationNormalization tool Morphological Analyzer/GeneratorMachine Translation ToolMachine Translation ToolStatistical Relational Learning SoftwareTagger/ParserStatistical Relational Learning SoftwareCorpusClassifierText-to-Speech SynthesizerRecording ToolSpeech Recognizer/Transcriber Speech Recognizer/TranscriberSyllabifierMachine Learning ToolSoftware ToolkitSoftware ToolkitSoftware Toolkit LexiconAcquisition Tool CorpusDatabaseDatabaseDictionaryInformation Retrieval ToolLexiconLogging ToolMorphological Analyzer/GeneratorPhoneme SetSoftware ToolkitSoftware ToolkitSoftware ToolkitSoftware ToolkitSoftware ToolkitSoftware ToolkitSoftware Toolkit Terminology Tool TreebankWeb ServiceWeb Service Temporal TaggerRepository SystemText editorMachine Translation ToolSoftware ToolkitLexiconSoftware ToolkitWeb ServiceMachine 
Learning Tool Machine Learning Tool DatabaseMachine Translation ToolMachine Translation ToolOntologyCorpusEvaluation DataTool: discourse coherence modelAnnotation ToolEvaluation ToolTokenizerTagger/ParserNamed Entity RecogniserLexiconOntologyMorphological Analyzer/GeneratorGrammar/Language ModelMachine Translation ToolNot AssignedRepresentation-Annotation Formalism/GuidelinesSentence ChunkerTranslation Support Web SiteSentence SegmentationNot InsertedInformation Retrieval systemWord Sense DisambiguatorMorphological analysersummarization moduleopen-search engine infrastructureCorpus Query ToolLanguage IdentifierEvaluation PackageAPIs for ontology managementsearchernatural language toolkitkeyphrase extractorclustering toolManipuri StemmerLanguage Modelling ToolkitSMT DecoderChunkerSupport Vector Machine Learning ToolWord Alignment ToolEnglish Morphological AnalyzerMachine Translation toolkitMachine learning softwareTerminologyEvaluation Methodology/Formalism/GuidelinesSignal Processing/Feature ExtractionQuery Loglanguage learning toollanguage model toolkitCRF training toolkitopen source systemLabeling toolWord-aligned corpus coupled with a multi-lingual ontologyStatistical decoderMonte-Carlo based paraphrase generatorsummarizerData Mining ToolJava library for interfacing with WordnetQA datasetmachine learning packageOpenNLP Java Based PackageSurface RealiserSemantic Network CreatorUnsupervised, Language Independent Sentence AlignerBackground Knowledge BaseCoreference ResolutionInformation Extractorrhetorical structure taggerstructure search engine/annotation data repositoryCode: Semantic ParserCode: Semantic Parser / Language GeneratorGrammar customization systemProficiency testing toolJava interface to WikipediaTypological DatabaseTool to restructure FrameNet database and annotationsDemo SoftwareSummarization ToolCrowdsourcing for Document SummarizationRelevant Term extractorStatistical Relational Learning SoftwareTranscriberClassifierProsodic AnalyserText-to-Speech 
SynthesizerRecording ToolSpeech Recognizer/Transcriber SyllabifierSpeech synthesiserFinite state toolsMorphological AnalyzerFormal grammar for parser generationLanguage databaseNLP FrameworkCorpus analysis softwareText Navigation ToolReference chain identification moduleRepresentation-Annotation Standard/Best PracticeOntology development methodologyOntology Development Environment / ToolStemmerGuidelinesStandardsCourse materialterm extractorterminological databaseSuite of analyzers: tokenizer, morphological, tagger, parser, WSD, coreference, ...finite-state compilerCorpus search toolNLP and Semantic Services Platformmorphological analyzer and generatorlanguage technology infrsatructureTool Search Tool for Linguistic Knowledge DiscoveryResource Tool: Morphological Analyzer (segmenter + POS tagger)Resource Tool: Lexicon AcquirerStatistical machine translationResource toolTransliteratorSpeech RecordingsSearch engineplatform for semantic processing of textSpeaker recogniserframeworkautomatic syllable segmentation toolCorpus Query SystemTool for Acquisition of Verbal Subcategorization Information from CorpusPhrase-based statistical machine translation systemLngugage modeling toolN-gram based statistical machine translation decoderTools for querying an N-gram databaseTools for querying a tagged N-gram databaseConcordancerThe Weka workbench[1] contains a collection of visualization tools and algorithms for data analysis and predictive modellingADN-ClassifierEditing ToolLRT portalStatistical Speech ResourceText BookResource: encyclopedic knowledgeReasonerKnowledge representation toolResource: morphologyan implementation of many classic semantic relatedness approaches based on wordnettools for accessing Knowledge baseWord alignerStatistical Machine Translation toolkitsoftware that is toolkit and resource environment to assist translationCorpus, Tools, Evaluations, etc.Corpus search and handling toolA syntactic judgments databaseText-to-speech synthesis front 
endPart-of-Speech TagsetAnnotated corpora and annotation specificationsLexical Isolation Point predictorOntology EditorLexical analysis softwareCorpus Format ConverterTerminology management systemFor accessing to a resource datasetCorpora compilation and evaluationlanguage service infrastructureLanguage Modeling ToolLexicon/corpusResource Tool: Treebank Searchcorpus semantically annotated with respect to an ontologyMachine Learning Classification ToolMachine Learning ToolAutomatic summarization toolMetadataMultilingual Research ToolSentiment Analysis ToolCross-language retrieval and machine translation platformTextual Entailment Recognition SystemGrammar checkerControlled Language guidelinesMT engineSpell CorrectorStatsitical language modeling toolSMT toolSentence splitterLemmatizerOntological and lexical resourceTool for transcribing scanned textInteractive-Predictive Handwritten Recognition ToolsdatabaseThesaurusResource-Tool: Coreference Resolutionweb-based authoring tool for NLP-intensive language learning activitiesAdaptation systemSpeech and Handwritten Text RecogniserWikipedia APIUIMA ToolkitConditional Random Fields (CRFs) taggerCross Document Corefernce SystemA suite of NLP toolsOnline EncyclopediaEntity Mention Detection (EMD) toolTool. 
Word similarity.Existing-usedNewly created-finishedNewly created-in progressExisting-updatedNot ApplicableWe create the training resources for French, TTL is already available for English and Romanian (Ion, 2007). 
This is a joint work with RACAI (Romanian Academy) to obtain a tagger producing similar output for French, English and Romanian.evaluation in progressExisting-in progresscreated, continuously updatedExisting - in progressExisting (Preliminary state), Proposing evaluation and linguistic extensioncreated and in progress; portions currently being used in sponsored projectBeta Version of Game TwinitySome are newly created-in progress, some are existing-updatedWeb-based corpus, automatically collected.Existing - used, but awaiting further enhancementInitial version completedDIS: draft international standard of ISOit originates from Semantic Engine of Knowledge Management platform Semantic Turkey. Server is stable and recently usable as described in the paper, though currently distributed in its original form (anyway, can already be used as a Semantic Desktop serveexisting and used; still work in progressplanningselection of French-Romanian corpus5422 sentences at last stable version. An additional 4047 sentences in ongoing version.Merged and enhanced version of two existing resources, Morfeusz SGJP and Morfologikto be released with this publicationAnnotated version of an existing raw corpus (lemmatized, clustered, parsed version of the Est Republicain)From OwnerNot AvailableFrom Data Center(s)Freely AvailableFrom OwnerNot AvailableFree on request from authorsNot ApplicableIn progress. 
Please contact lead authorshould be available later onfree browsingFreely available for academic useFrom Data Center(s)Freely AvailableFrom OwnerWill be published in the futureNot ApplicableNot AvailableFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. Peter Scharf)From Data Center(s)Freely AvailableFrom OwnerWill be published in the futureNot ApplicableNot AvailableFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. Peter Scharf)Free on request from authorsIn progress. 
Please contact lead authorshould be available later onfree browsingFreely available for academic useavailable at publicationFreely AvalableFrom OwnerOnlineFrom Data Center(s)Not ApplicableNot AvailableNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesFreely available for academic useOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFreely AvailableFrom Data Center(s)From OwnerAvailable in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion 
is limitedFreely available in Summer, 2011Not AvailableFrom OwnerNot AvailableFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableWill be published in the futureNot ApplicableFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011available at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesFreely available for academic useOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableWill be published in the futureNot ApplicableFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011available at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesFreely available for academic useOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. 
Please contact lead authorshould be available later onfree browsingFreely AvailableFree Available and our own corpusFrom OwnerTo be determined (in the next few months)From Data Center(s)Will be made available onlineNot ApplicableNot AvailableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website. 
PolishWill be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionSwedishFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web site Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011 PolishSwedish Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011 PolishSwedish Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingThe resource will be made available as soon as possible, by means of licensePart of SoNaR projectto be definedavailable with permissionNot finished yetFree for LDC membersTo Be DeterminedNot available yetIn progressavailable for researchNot yet available, but will become freely availableWe will make the treebank available. 
Undecided about how this will be done.soon available for free download from TuT web site: www.di.unito.it/~tutreebContact authorsOnline query via TXM WEB platform, free upon registrationwill be made availablefreely available upon completionFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011As part of the Dutch Parallel Corpus (via the Agency for Human Language Technologies - TST-Centrale)Xerox Research Centre Europe ownerThe submitted LREC article describes parts of a PhD thesis that will be made available via University of Pretoria library (electronic dissertation)Free for non-commercial useMembership subscriptionBook: Ontological Engineering: with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web.Available upon a licenseWe are looking into the availability.Freely available under Creative Commons Licensefreely available, from data centerfrom maintainerWill be available as open source.freely available for searching, subcorpus freely availableacademic license from Lingsoft Oymay become available latervia web browserAwaiting copyright permissionsWill be released as open-source in time for LREC 2010Sprachwissenschaftliches Institut - Ruhr Universität Bochum - GermanyTST centrale (http://www.inl.tst.nl)Fee, Open Source, GPL licensePartialy available on webTo be determinedAvailable with FDL licenseTo be made public soonmostly available on requestThe database will be available via ELDA/ELRA.Released under CC-BY-SACurrently examining the legal status for general or limited distributionDistributed on CDROMNot available yet, but freely browsable soonwill become freely available for research purposeshttp://www.iai-sb.de/iai/index.php/IAI-Produkte.htmlAvailable after completion of PhDdepends on affiliationPermission to make it available is being investigatedWeb interfaceOriginal from owner. 
Updates by company LingIT.to be made availableMost of the data are available for research from the owner, will be later made available through DK-CLARINFrom owner, when finishedIn principle, available, but the annotations were carried out by one person and have not been cross-checked, so we would rather not distribute it yet.Demo will be available from the owner after the end of the projectlicensed; free-for-researchCurrently available to partner projects onlyAvailable to researchers on-site onlyAvailable from owner for research purposes onlySentiWordNet requires license; we provide free access once you have the datasource installed in your machineavailable soon from the CREAGEST serveravailable for non-profit usefreely available, via licence, for research purpose. Can be purchased for commercial purposes.freely available, upon prior obtention of licence for the French treebankunder discussion with the ownerJoin the evaluation and get the corpusWill be accessible as a web service in 2010Freely available for scientific, non-commercial research purposes through search and analysis software COSMAS IIMust obtain rights for each source terminologyFree for academic uses; obtain from VIDAL SA for 3 yearsfrom DOXA projectDistributed by ELRA/ELDANot available YET, but will be.will be freely available online by the end of the year 2009Large portions will become available in the last quarter of 2010Not yet available. 
Will be available after completion.Not available for the momentnot yet decidedELDAAvailable for researchers when finishedcommercialDistribution via ELRA under negotiationBeing developed , under deliberationIt will soon be available as free web serviceIt will soon be availablenot freely available, yetBeta version availableFreely available in the near futureAnnotation freely available, text data available from NIST and LDCDutch HLT Agency TST-Centraleavailable at time of publicationcan be consulted via an interfaceWe have not decided on how the treebank should be released to the publicfrom Linguistic Data Consortiumplease contact owner; see belowplease contact ownerNot yet released for public useStill under developmentFujitsu Research and Development CenterPartially IPR-Free documentsAvailable under specific licenseFreely Available (Annotation), From Data Center (Text Data)Not Applicable (Annotation), From Data Center (Text Data)Available for research purposes after the end of the project 2010Will be available in successive versions.Manually compiled from Yahoo! 
Local websiteManually compiled through Yahoo!'s APIFrom owner / online querieswe are currently checking how we can make available the resourcepassword protectedStraight-forward to reimplementcurrently being used in sponsored projectAvailable online at http://emm.newsexplorer.euAvailable online at http://emm.newsbrief.euonline demonot yet available, as work is in progressAnnotation in progress; when finished, annotations will be freely available and texts downloadable from a data centeravailable upon registration and requestClient Version for Game available for Windowsweb serviceWill be made available after an ungoing consolidation phaseOriginal available from owner, updated version obtained from company LingITCambridge UniversityOn request from the authorsELRADistributed within the electronic version of KORAIS Greek-English dictionaryWill be freely available once completed.under a Creative Commons Attribution - Non Commercial - Share Alike license upon completion of the projectFor registered usersMembership requiredLDC members and licenceesBase corpus (RST-DT) licensed by LDC, TSR annotations freely availableContact Chungdahm Learning, inc. 
for licensing.through ELDAIt is planned to release a subset of the resource as freely available at the date of the conference.Participants of the CIPS-ParsEval-2009 evaluationWill be available as soon as possibleAvailable from owner only, but will be publically available in the near futureNot available yet, but we plan to make it available to the community in the future.Free online accessCurrently available to MR Evaluation Teams; will be published in LDC catalogCurrently available to TAC KBP program participants only, but will be made available from LDC catalogFreely available for participants of Ester2Freely available gold standard subcorpuscommercial versionfreely accessible via Deriv toolsAvailable for participants registered at CLEFavailable to THESEUS consortiumavailable to the THESEUS consortiumAwaiting copyright permissions: plan to be freely availableWill be freely available when finishedto be seen in accordance to company copyrightswill be made available laterIn preparation for free distributionRCV1 - from owner; labels for NER freely availableavailable for research purposes with a licence from Columbia UniversityFreely usableVia the GF repositoryAn up-to-date draft version of the Reference Model is available under the URL, there has not been an official release, though.ISO copyrightedWill be made availibleCertain parts of the resource is freely availableAvailability scheme not yet finalisednot yet available because still under constructionavailable for academic usefrom author under conditions formulated by the patientsThe corpus will be made freely available once the paper is acceptedStill under development at the LML, IPP, BASonline dictionary freely available via internet; XML-based database and underlying corpus data for internal use onlySketch Engine Website, Wacky ProjectOnce PhD completedSOAS, Endangered Language Archive (open access from 2012Data is copyright protected (BBC)contact Université Paris 8 LSF research teamFree(open source)freely 
available in the futureFreely available from EurogeneU.S. ArmyProprietarywill become available after further developmentsample (ca. 350 million words) available for querying;some are open, others restricted, some closedDEMO VERSION published in March 2010Usable via web interfaceUsed to be availableWill be made available soonAttribution-Non-Commercial-Share Alike 2.0 licenseFreely available to the academic communitywill be available via http://www.semaine-db.eu/not available at the momentavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCon-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankcan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingFrom OwnerNot AvailableFrom Data Center(s)Freely AvailableFrom OwnerNot AvailableFree on request from authorsNot ApplicableIn progress. Please contact lead authorshould be available later onfree browsingFreely available for academic useFrom Data Center(s)Freely AvailableFrom OwnerNot AvailableFree on request from authorsNot ApplicableIn progress. Please contact lead authorshould be available later onfree browsingFreely available for academic useFrom Data Center(s)Freely AvailableFrom OwnerNot AvailableFree on request from authorsNot ApplicableIn progress. Please contact lead authorshould be available later onfree browsingFreely available for academic useFrom Data Center(s)Freely AvailableFrom OwnerNot AvailableFree on request from authorsNot ApplicableIn progress. 
PolishWill be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionSwedishFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web site Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011 PolishSwedish Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011 PolishSwedish Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingThe resource will be made available as soon as possible, by means of licensePart of SoNaR projectto be definedavailable with permissionNot finished yetFree for LDC membersTo Be DeterminedNot available yetIn progressavailable for researchNot yet available, but will become freely availableWe will make the treebank available. 
Undecided about how this will be done.soon available for free download from TuT web site: www.di.unito.it/~tutreebContact authorsOnline query via TXM WEB platform, free upon registrationwill be made availablefreely available upon completionFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011 PolishSwedish Gaelic Greek and Finnish Sinhalapt Arabic de Greek English Basque Telugu Xitsonga Sesotho csavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCStill under developmenton-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followNot yet decidedtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankWill be made available sooncan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. Please contact lead authorshould be available later onfree browsingThe resource will be made available as soon as possible, by means of licensePart of SoNaR projectto be definedavailable with permissionNot finished yetFree for LDC membersTo Be DeterminedNot available yetIn progressavailable for researchNot yet available, but will become freely availableWe will make the treebank available. 
Undecided about how this will be done.soon available for free download from TuT web site: www.di.unito.it/~tutreebContact authorsOnline query via TXM WEB platform, free upon registrationwill be made availablefreely available upon completionFreely AvailableFrom Data Center(s)need Mainichi Shinbun '95 corpusFrom OwnerDUCNot AvailableFree Available and our own corpusTo be determined (in the next few months)Will be made available onlineNot ApplicableNTCIR evaluation campaign organizersPartly openSoon to be freely availablewill be freely availableLDCparticipating teamThe original corpus is available from the owner.CD attached with a bookTo be freely-availableMay become available in the future - please monitor website.Will be available, shortlyCopyrighted by ownersIWSLT evaluationSIGHAN-3,4 competitionFreely available for academic useWithin our organizationAvailable from i2b2, need to sign a DUAWill be available in personal web siteWill be published in the futureFrom maker (Prof. Amba Kulkarni)soon available - legal clearance in processFrom Maker (Prof. 
Peter Scharf)Available in the future through the ETAPE projectThe Speech ArkLinguistic Data Consortiumavailable for scientific purposes, but distribusion is limitedFreely available in Summer, 2011As part of the Dutch Parallel Corpus (via the Agency for Human Language Technologies - TST-Centrale)Xerox Research Centre Europe ownerThe submitted LREC article describes parts of a PhD thesis that will be made available via University of Pretoria library (electronic dissertation)Free for non-commercial useMembership subscriptionBook: Ontological Engineering: with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web.Available upon a licenseWe are looking into the availability.Freely available under Creative Commons Licensefreely available, from data centerfrom maintainerWill be available as open source.freely available for searching, subcorpus freely availableacademic license from Lingsoft Oymay become available latervia web browserAwaiting copyright permissionsWill be released as open-source in time for LREC 2010Sprachwissenschaftliches Institut - Ruhr Universität Bochum - GermanyTST centrale (http://www.inl.tst.nl)Fee, Open Source, GPL licensePartialy available on webTo be determinedAvailable with FDL licenseTo be made public soonmostly available on requestThe database will be available via ELDA/ELRA.Released under CC-BY-SACurrently examining the legal status for general or limited distributionDistributed on CDROMNot available yet, but freely browsable soonwill become freely available for research purposeshttp://www.iai-sb.de/iai/index.php/IAI-Produkte.htmlAvailable after completion of PhDdepends on affiliationPermission to make it available is being investigatedWeb interfaceOriginal from owner. 
Updates by company LingIT.to be made availableMost of the data are available for research from the owner, will be later made available through DK-CLARINFrom owner, when finishedIn principle, available, but the annotations were carried out by one person and have not been cross-checked, so we would rather not distribute it yet.Demo will be available from the owner after the end of the projectlicensed; free-for-researchCurrently available to partner projects onlyAvailable to researchers on-site onlyAvailable from owner for research purposes onlySentiWordNet requires license; we provide free access once you have the datasource installed in your machineavailable soon from the CREAGEST serveravailable for non-profit usefreely available, via licence, for research purpose. Can be purchased for commercial purposes.freely available, upon prior obtention of licence for the French treebankunder discussion with the ownerJoin the evaluation and get the corpusWill be accessible as a web service in 2010Freely available for scientific, non-commercial research purposes through search and analysis software COSMAS IIMust obtain rights for each source terminologyFree for academic uses; obtain from VIDAL SA for 3 yearsfrom DOXA projectDistributed by ELRA/ELDANot available YET, but will be.will be freely available online by the end of the year 2009Large portions will become available in the last quarter of 2010Not yet available. 
Will be available after completion.Not available for the momentnot yet decidedELDAAvailable for researchers when finishedcommercialDistribution via ELRA under negotiationBeing developed , under deliberationIt will soon be available as free web serviceIt will soon be availablenot freely available, yetBeta version availableFreely available in the near futureAnnotation freely available, text data available from NIST and LDCDutch HLT Agency TST-Centraleavailable at time of publicationcan be consulted via an interfaceWe have not decided on how the treebank should be released to the publicfrom Linguistic Data Consortiumplease contact owner; see belowplease contact ownerNot yet released for public useStill under developmentFujitsu Research and Development CenterPartially IPR-Free documentsAvailable under specific licenseFreely Available (Annotation), From Data Center (Text Data)Not Applicable (Annotation), From Data Center (Text Data)Available for research purposes after the end of the project 2010Will be available in successive versions.Manually compiled from Yahoo! 
Local websiteManually compiled through Yahoo!'s APIFrom owner / online querieswe are currently checking how we can make available the resourcepassword protectedStraight-forward to reimplementcurrently being used in sponsored projectAvailable online at http://emm.newsexplorer.euAvailable online at http://emm.newsbrief.euonline demonot yet available, as work is in progressAnnotation in progress; when finished, annotations will be freely available and texts downloadable from a data centeravailable upon registration and requestClient Version for Game available for Windowsweb serviceWill be made available after an ungoing consolidation phaseOriginal available from owner, updated version obtained from company LingITCambridge UniversityOn request from the authorsELRADistributed within the electronic version of KORAIS Greek-English dictionaryWill be freely available once completed.under a Creative Commons Attribution - Non Commercial - Share Alike license upon completion of the projectFor registered usersMembership requiredLDC members and licenceesBase corpus (RST-DT) licensed by LDC, TSR annotations freely availableContact Chungdahm Learning, inc. 
for licensing.through ELDAIt is planned to release a subset of the resource as freely available at the date of the conference.Participants of the CIPS-ParsEval-2009 evaluationWill be available as soon as possibleAvailable from owner only, but will be publically available in the near futureNot available yet, but we plan to make it available to the community in the future.Free online accessCurrently available to MR Evaluation Teams; will be published in LDC catalogCurrently available to TAC KBP program participants only, but will be made available from LDC catalogFreely available for participants of Ester2Freely available gold standard subcorpuscommercial versionfreely accessible via Deriv toolsAvailable for participants registered at CLEFavailable to THESEUS consortiumavailable to the THESEUS consortiumAwaiting copyright permissions: plan to be freely availableWill be freely available when finishedto be seen in accordance to company copyrightswill be made available laterIn preparation for free distributionRCV1 - from owner; labels for NER freely availableavailable for research purposes with a licence from Columbia UniversityFreely usableVia the GF repositoryAn up-to-date draft version of the Reference Model is available under the URL, there has not been an official release, though.ISO copyrightedWill be made availibleCertain parts of the resource is freely availableAvailability scheme not yet finalisednot yet available because still under constructionavailable for academic usefrom author under conditions formulated by the patientsThe corpus will be made freely available once the paper is acceptedStill under development at the LML, IPP, BASonline dictionary freely available via internet; XML-based database and underlying corpus data for internal use onlySketch Engine Website, Wacky ProjectOnce PhD completedSOAS, Endangered Language Archive (open access from 2012Data is copyright protected (BBC)contact Université Paris 8 LSF research teamFree(open source)freely 
available in the futureFreely available from EurogeneU.S. ArmyProprietarywill become available after further developmentsample (ca. 350 million words) available for querying;some are open, others restricted, some closedDEMO VERSION published in March 2010Usable via web interfaceUsed to be availableWill be made available soonAttribution-Non-Commercial-Share Alike 2.0 licenseFreely available to the academic communitywill be available via http://www.semaine-db.eu/not available at the momentavailable at publicationFreely AvalableOnlineNot available at this timenot yet available1M subcorpus freely availablewill be freely available once the user has the license for the original treebankThere are plans to make a stable version freely available (open source).concordances free, full texts in negotiationFree for research purposesto be released at end of projectpublicsee http://lemurproject.org/clueweb09/Can be purchased from ISOWill be available as soon as possible, hopefully before the conference datePublished as a book and DVDFrom Owener starting mid of 2012not finalized yetdifferent availability statuses, in the longer perspective available under open licensesFreely Available (to be confirmed)ELRA and John Benjamins (Book + DVD)As yet undetermined.Freely available for research purpose (see license)online access/parts are freely available from IICTfreely available for search, license needed for downloadthrough LDCon-line platformfree for academic purposesnot publicly availableAccessible via web interfaceFrom two project sitesfrom owner - for research purposesFrom PublisherThis data is released under a slightly modified Creative Commons Attribution Share-Alike license. 
One must register as a participant in Blizzard Challenge 2012Unavailable due to privacy issuesOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/mtapplication.html)to be checkedOnly dependency information availableto followtbdin preparationAvailable after project completionfreely available, but subject to requester having obtained the license for the original TIGER treebankcan be crawled, corpus is not published due to copyright issuesFreely available to those holding a license for the source datasetfor participants in the evaluation campaignOnline (http://www.nlpresearch-ucsy.edu.mm/NLP_UCSY/dictionaryapplication.html)Corpus interface tool currently under developmentFree for academic usersSearchable, subcorpus available on GNU GPL.Freely available, but subject to having obtained Penn Treebank from LDCavailable to the US governmentIt will be freely available as a web application once completely createdAvailable PDF files (http://nl.ijs.si/isjt10/). However the corpus itself is at this moment not available.Free on request from authorsIn progress. 
Please contact lead authorshould be available later onfree browsingSubjectivity and sentiment analysisMachine Translation, SpeechToSpeech Translationsemantic role labeling, anaphora resolutionPart-of-speech, chunk and dependency annotated dataword segmentation and part-of-speech taggingIt's a standard source: we have used it for comparison with our work.hyponymy, synonymyNot used to perform any computational linguistic task, but semiautomatically augmented with the help of non-expert usersparsing evaluationText Complexity Analysis and Text SimplificationDocument Classification, Text categorisationTransliteration miningNamed Entity RecognitionSemantic Role Labeling (Evaluation)DiscourseParsingLanguage ModellingShallow semantic parsingTextual EntailmentSummarisationparser training and testingparser training, parsingConvert PATB to use for dependency parsingMachine Translation, SpeechToSpeech TranslationNot ApplicableSentiment analysisSentiment Analysis and Opinion MiningInformation Extraction, Information RetrievalText MiningNatural Language GenerationSpell Correction or SearchAcquisitionGrammatical Error CorrectionWord Sense DisambiguationAutomatic TreebankDialogueParsing, Multi-word ExpressionsComputer Aided Assessment EvaluationDeception Detectionsyntactic parsing and morphological analysisKnowledge Discovery/RepresentationEvaluationPart-of-speech taggingHuman TranslationEmotion Recognition/GenerationStatistical parsingFormalism-to-formalism corpus conversionlexical substitutiontraining of machine learning algorithms, text-to-speech synthesis, spell-checking, predictive text, grammar checking, machine translation, language researchFor the SSD task we defineDimensionality calculationanalysisMultiword ExpressionQuestion AnsweringtokenizationCoreference ResolutionLanguage IdentificationRetrieval Evaluationsentence boundary detectionUnlabeled corpus of Chinese web pagesParts of Speech Tagging of ManipuriFor stemming Manipuri wordswide coverage dependency grammar 
constructionSemantic Relatedness evaluationSemantic WebTaggingpart of speech taggingmachine learningsemantic distance metricLanguage modelling, parsing, generationParser trainingPOS taggingPOS taggingSpeech Recognition/UnderstandingLinefeed insertionclause boundary detectionTerm ExtractionESL Error Detection and CorrectionSemantic role labelingEvaluate data for ESE verb error correctionTrain a Perceptron modelBuild confusion setTopic Detection and TrackingChinese Syntactic ParsingSyntactic parsingSession DetectionAutomatic evaluation script for morphological analysissmall test dataset for idiomatic/literal sentence classificationdictionary entry parsingtrain parsers, human-robot interactionparaphrase generationFrame-semantic annotationautomatic speech assessmentCompositional Distributional SemanticsSentiment Analysis and many NLP tasks which deal with subjective contentLearner Language AnalysisDependency ParserPlagiarism DetectionSubjectivity AnalysisJava library for interfacing with Wordnetlexical similarity/relatednessPredicting readabilitySemantic resource creationModelling and visualizing version controlled documents.Chinese word segmentationChunkingPerson IdentificationTransliteration test benchExtended for Alignment of Ambiguous Links, Semantic ParsingSemantic Parsing, Language GenerationSemantic Parsing, Language Generation, Alignment of Ambiguous LinksGrammar EngineeringSubjectivity annotationsSpeech SynthesisSentence type identificationPOS, annotationEvaluation of question ranking system for social Q&A sitesEvaluation for sentence alignment algorithm and machine translationWeb ServicesTagging EvaluationTool to restructure sense inventory and annotations of FrameNetautomatic semantic role labelingsupertaggingWord segmentationEntity LinkingCompositionality JudgementsDependency ParsingWord ChoiceLexical Resource AlignmentSemantic Role Labeling, Anaphora ResolutionLearning Logical Structures of Legal TextsSemisupervised LearningMultimedia Document 
ProcessingTraining Data for Domain AdaptationWord Sense InductionWord space ModellingNear-Synonym ChoicePOS tagging, Chunking, ParsingParse SelectionGeneral Statistical Relational LearningWord Segmentation and POS taggingSpeaker Diarization Speech/speaker recognitionProsodyIntelligibility testing Word alignmentEmotion AnnotationTraining a Statistical POS taggerMorphological AnalysisTesting/EvaluationSource of data for a lexicon associated with a formal grammarlanguage teachingText EngineeringText/MultiText NavigationSpellcheckingevaluation of a reference chain identification modulelanguage technology, translation technology, didactical purposesPOS Tagging and ChunkingAnalysis (of Topology and Content related Properties)Text parsingFor the transfer of a methodology for writing safe and safely translatable alert messages and protocols to other European member statesBy linguists and end users (authors) of alert messages and protocolsCourses to users (authors)Course to linguistsParsing, syntax, evaluationmachine translation, wsd, question answering, etc.word sense disambiguation, named entities recognitionsense disambiguation, named entities recognition, etc.word sense disambiguation, named entities recognition, etc.any of the included analysis servicesSpelling checking/correctionMorphology, surface syntax(too many uses to specify here)client basedCreation of language models, Training and evaluation of statistic based processing tools or applications, Linguistic research, etc.Creation of language models, Training and evaluation of statistic based processing tools or applications (including stochastic parsers), Linguistic research, etc.Creation of language models, Training and evaluation of statistic based processing tools or applications (including dependency parsers), Linguistic research, etc.Creation of language models, Training and evaluation of statistic based processing tools or applications (including POS-taggers, lemmatizers, morphological analyzers, NERs), 
Linguistic research, etc.Creation of language models, Training and evaluation of statistic based processing tools or applications (including semantic role labelers), Linguistic research, etc.resource to be used for implementing language-aware editing functions (our intention); it is currently integrated in TextGrid, the Virtual Research Environment for the Humanitiesresource to be used for implementing language-aware editing functions (our intention); originally developed as a resource for spellcheckersresource to be used for implementing language-aware editing functionse-sciencelanguage annotation and learningExtraction of examples from real dataTemporal ReasoningToo broad to specifyPsycholinguistics, Cognitive linguistics, Language AcquisitionNLPtraining and testing of NLP tools (parser)Transliterationresearch in semantics, NLP, IRresearch in morphologysemantically annotated corpus for deverbal nounstagging posTraining parser, input to systemSimplification of texts based on syntactic transformation rulesnatural language interpretationTool for listening experiment (measure assessment)Lexicon enhancementComplex editor and viewer not only for treebanksQuery engineLinguistic research and training corpus for classifiersAnnotationAnnotation Mining and training of automatic classifierAnnotation Toolbrowsing systemlanguage technology, translation technology, didactical purposesextraction of semantically grouped terms from the corpusSupersense taggingNatural Language Processingmorphologic annotationannotation of dependencies, annotation of semantic rolesautomatic syllable annotationCorpus Query System Sketch EngineLexicographyVoice ControlAny NLP application using web-scale counts, such as parsing, tagging, coreference, etc.for many different research purposesSpeech analysisLexical representationany natural language processing on RomanianShow Parsing Trees, Opinion AnalysisOpinion AnalysisLanguage learning with special reference to translationtext normalization for corpus 
creationAge ClassificationWord Net Lexical DevelopmentDevelopment of bilingual translation memoryCzech Morphological TaggerComplex editor, viewer and search engine not only for treebanks.Phonetic researchIR evaluationLRT explorationMorphological AnnotationStatistical Speech/Language ModellingCorpus AnnotationSign Language Recognition/Generationphonetic transcription, lexical modelingGeneral Purpose morphological analyzer and part-of-speech taggerUsed to tag a corpus of health texts.Used to tag a corpus of health textsHealth texts can be used as a basis for evaluating readability assessment toolsNatural Language UnderstandingStatistical analysisStatistical analysiscreation of automated syllabifierRetrieved most frequent wordsMorphological analysis and part-of-speech tagging for ArabicA resource for contrastive research into cleftsBilingual lexicon inductionWord aligner evaluationTerminology extractionAutomatic Phonetic Transcriptionmultipurpose: translate (human & MT), postedit, annotate, categorize, evaluatepart-of-speech tagging, and lemmatizationMulti-level annotationsyntactical relation taggingCorpus search and handling toolAs a speech corpus, it can be used for many purposes, from language modelling to speech recognition, and generation.automatic semantic role labeling, word sense disambiguation, machine translation, question answering.parsing, semantic classification, analysis of the syntax-semantics interfacelanguage processingAnnotation of Speech, Discourse, Gestures, Sign Languages, Multimodality etc.Neuroimaging in Language usesMany: syntax, vocab, semanticsQualitative research in oral history and social sciencesDeveloping reference resolution enginesterminology management, dictionary writingCollaborative terminology management, Semantic WebInterpretation; sentiment analysis; psychotherapyHeritage documentation, Language Maintenancetranslation by humanAutomatic Term RecognitionCorpus creationAccessing large amounts of textual dataParaphrasingsign language 
corpus annotationSign Language Linguistics, SL teaching, gesture researchvarious new services specified by the BPEL workflow languageEnglish POS tagging; English NER annotation; Syriac Morphological annotationQuantitative approaches to linguistic researchProsody analysis and description, prosody prediction for synthesisSystem EvaluationTreebank ViewingTool Developmentdiscourse understandingTreebank CreationStandardCreation of the treebankinput to the conversion into dependenciesto extract annotation rules for the ontology-based semantic annotationMorphology, shallow and deep syntax, valency, coreferenceWord ClassificationFeature ExtractionTool/framework used for creating and testing JAPE RulesEvaluation methodologyWord Sense Disambiguation, Creation of Domain wordnetsfact mining, crosslingual knowledge sharingDomain Wordnet CreationEvaluation of morphological analyser(s)LemmatizationLinguistic ResearchPrior PolarityPolarity lexiconUsual (multiple) uses of controlled vocabulariesAll uses of an inflectional lexiconInformation retrieval and machine translationContent extractionSpoken language documentationBasic NLP Tool, for many usesSemantic System EvaluationParser DevelopmentLinguistic and NLP studiesMost of the usages here but I cannot select more than oneMany of the usages here but I cannot select more than onePOS tagging, lemmatization, LM, etc.Cross-linguistic comparisonDictionaryMorpho-semantic analysisSynonym and derivational variants recognitionParaphrase extractionDevelopment and deployment of morphology and syntax grammarsHuman Behavior Detection and UnderstandingHandwriting RecognitionDiscourse AnnotationtranslationSemantic Role Labeling (SRL)Comparison to our produced resource + part of learning dataresearch infrastructureAccessing and Editing Language Resourcesmapping corpus annotation scheme onto different frameworkvalidation of produced structuresidentifying valency modifications in PDT treesVocabulary learning, language productionParsing, 
DisambiguationTreebank for Dependency ParsingSpecifications for NLP resourcesAdaptationtokenization, sentence splitting, PoS-taggingsentence design of GRID was adoptedSemantic similarityPOS-taggingPOS-tagging, lemmatisationdistributed, collaborative text annotationEditing emergency domain texts in order to make them clear to understandLinguistic and language technology researchGenerates coarser grained versions of FrameNet data. Useful for shallow semantic parsing and other applications.partial parsingmorphosyntactic analysispartial parsing, taggingmorphological analysis and generationSyntactic AnnotationCreating dictionaries based on word alignmentCreating parallel corporaTool-chain for basic text analysis including morphological disambiguationto train a classifier for automatic assignment of wh-questions to verbal arguments in a system conceived to help poor-literacy readers to perform detailed text readingGeneral Natural Language ProcessingWorkflow ManagementstandardizationNatural Language AnalysisTerminology ManagementComputer-Aided Language LearningA set of metaphorical mappings and categories from the list can be used for both empirical studies of metaphorical associations and the development of computational models of metaphorAutomatic metaphor recognition and interpretation experiments; an empirical study of metaphorical associationstestingPhrase-Structure corpus analysissummarization, metonymy resolution, othersTo Develop Gujarati Morphological AnalyserA List of Hindi base-forms used with Hindi Morphological AnalyserTo Learn Suffix-Replacement Rules for Morphological AnalyserA Program for Developing Morphological Analysers for South Asian Languagesvarious usestext stylisticsBilingual dictionary creationQuestionnairespeech disorder analysisGrammar developmentautomatic text analysis at morphological, analytical and tectogrammatical layers of descriptionDiverseNumber Sense Disambiguation (similar to Word Sense Disambiguation)Raw, unannotated email 
textpsycholinguistic researchhuman-robot-interaction, robust parsingLarge Treebank for Hindi Language Processingnavigation, visualization/listening and extraction of some (semantic, sociolinguistic, etc.) information using the corpus itself (transcription/sound files) and the database with the metadata describing the corpus.Integration with e-learning environmentsGold standard for adjective classificationmorphological derivationSyntactic parserontology matchingTo obtain test datawriting supportExtracting semantic relationsNoun phrase extractionOpinion analysis, media content analysisCoercion/Metonymy recognitionFormal knowledge and information reference for various use casesIntegrating Deep & Shallow NLPIntegrating language resources as web servicessemantic annotation, semantic searchannotation, machine learning, semantic search, testingImprovement of the automatic identification of (idiomatic) Multiword ExpressionsValidation of idiomatic expressionsExtraction of Multiword ExpressionsPart of Speech Tagging, ICALLExtraction of gold Stanford dependenciesPrepare data for some of the downstream parsersSource data for gold Stanford dependenciesOpinion miningPre-processing, TokenizerCorpus annotation and searchParsing, Evaluation.Propbank frameset creationcorpus creation (this may be the same as acquisition)Voice Activity Detectionparaphrasing, text composition, authoring aid, controlled language, MT pre-editing, web serviceparaphrasing, text composition, authoring aid, controlled language, MT pre-editingmulti-purposeparaphrasing, resourcesAll applications using annotated corporaGrammar induction from positive examplesComputer Assisted Language LearningValence lexiconsentence alignmentAnaphora ResolutionSubtitle generationverification of syntactic structureResearch on Information Structure, Syntax, ProsodyLexiconPhone model trainingModel - signal alignmentLanguage researchEducationalMulti-Language lexical resource extension, question answering, textual entailmentnatural 
language processing tasks (multiple), speech recognition, multimedia processing and generation, knowledge representation, computer vision tasks (e.g. object recognition, event recognition), and computational analysis of kinematicsSpontaneous speech descriptionCorpora annotationfor text preprocessingStandard text preparationrecommendationLexicography, Statistical NLP Tasks, Applied and Theoretical NLPFrame Semantic Annotationmorphological annotation/tagging of a corpusIn all text processing applicationsfor text preprocessingcomparison with Czech Web Corpussemantic relatednessWeb-based Data Collection of SpeechText AnnotationAnnotation Evaluation, Statistical TrainingSemantic Role Labellingword sense disambiguation, semantic relations, ontologymorphosyntactic tagging, lemmatization, chunkingMorphosyntactic taggingcleaning web pages, tokenization, sentence splitting, morphological analysis, Part-of-Speech tagging, phrase chunking, Named Entity recognition, lemmatization, multiword recognitionCleaning-up lexical resourcesAnaphora, CoreferenceCreating Verb LexiconLemmatizingModeling Social Phenomena in DiscourseMachine ReadingPlagiarismManual correction of named-entity pre-annotated corpusWordnetsFrameNet-based Shallow Semantic ParsingAnswer ValidationPropbank instance creationtransliteration similarity estimationAnnotation and grounding of toponyms in image captionsManual annotation and grounding of toponyms in image captionsEvaluation tool used for text alignment and annotationDesigned to geo-tag image captions to aid image indexing and retrievalThe resource is valuable for both linguists and NLP researchersmapping language resources and usersWeb-based applications, Information ExtractionParts of Speech Taggingquerying wordnet-like lexical databasescontrastive studies, language teaching, translation teachingPropbank frameset editorPropbank instance editorExperimental Research, Language learning, modellingmany purposesallsupporting infrastructureSpeaking avatarText 
analyticsTagging of Early Modern German corpusAutomatic annotation, manual correction of annotationCorpus-linguistic investigations, comparative studies with other historical corporamultiple uses, primarily meant as translation toolsmultiple usesGeneral metadata, potentially useful for a wide range of applicationsstandardized, general accessible data category descriptionConverting linguistic data into several formatsstandardized format for linguistic datastandardized representation of featuresDictionary definitionchunking, parsingdevelopment of plug-inHuman communication analysis, emotion recognition, theatrical improvisation analysisAnalysis of cognitive processes in the mathematics classroomDominanceEstimationa proof of concept for the proposed algorithmsGaze estimationGeneral LT infrastructuredocument classification; emotion recognition; multimedia document processing; language identification; person identification; voice control; web services; forensicsCorpus generationGenerate structured values (not just named entities)As a framework for NLP module and pipeline development / hostingAvatar modelling and rigging for signingMultiple usage scenariosCorpus-based language description (grammar) and analysisStudy of semantic relations between words and signssecond language learningcommon annotation scheme for Sign Language unitsLSF descriptionSign Language descriptionSign Language Corpora AnnotationTagging and lemmatizationinteractive labeling to construct statistical face modelsDevelopment of corporatext entryMyanmar Word SegmentationName MatchingMachine Learning, Automatic Term Recognitionpart-of-speech annotationall kinds of research usagesLRT descriptionComputational Lexicographyspell-checker evaluationPOS-tagging, parsing, corpus linguisticsCorpus constructionTokenisation, PoS tagging, Parsingcan be both WSD and information extractionemotion recognition, annotation, analysis of meaning and disambiguation of affective terms according to the recorded affective 
speechSentiment Analysis, Network AnalysisDependency Parsing OptimizationDependency Parser GeneratorTranslation error analysis evaluationModeling lexical resources, NLP lexiconsResource-based NLP applications, e.g. Word Sense DisambiguationLanguage DocumentationAnnotation standard/guidelines, validation schema, conversion toolsDiachronic studieslanguage evaluationLanguage learningscientific research - language acquisitionTreebankLinguistic and NLP ResearchString comparisoneye trackingAutomatic Phonetic Transcription and Segmentationcorpus compilationLexicography; CALL;POS tagging, LemmatizationSemanticsdictionary look-upCompositionality of MWEsNatural Language Analysis and Production, Tagging, Machine Translation, Context Sensitive Correction, etc.writing processClustering and statistical calculationsTerm extractionTemporal Extraction and NormalizationArchivingSentiment Analysis, Opinion ClassificationText Normalizationresearch: phonetics, psycholinguistics, artificial intelligenceeducation, lexicography, contrastive studiesTemporal Processingmorphology and syntaxcomplex evaluation of machine translationconstituent-to-dependency conversionAnnotation of various phenomenaparsing, parallel parsing, machine translation, coreference resolution, anaphora resolution, natural language generation, lexical acquisitionCross-linguistic investigationMorphological analysis/synthesisMachine Learning, Shallow Parsing, Named Entity Recognition, Word Sense DisambiguationTimeML annotationStimuliLanguage Learningknowledge acquisition and text production in the environmental domainsemantic annotationSemantic role labeling, Verb sense disambiguationSubsentential alignment and terminology extractionAnnotation and EducationalCould be used in a variety of applications.Transliteration dataHistorical Linguistics, Pragmatics, General Linguisticsbasically any NLP application/corpus using linguistic annotations can make use of this resourceCorpus-based language descriptionCorpus Query Systemfor 
text search and LT projectstarget of dictionary look-upSupports multi-disciplinary researchName Entities Annotation, WSD Annotation, Relation Annotation, Anaphora AnnotationText alignmentSyntactic Researchany natural language processing task on RomanianProsodic analysisNeologism detectionGeneralNatural Language Processingtackling interoperability issues within UIMA workflowsSemantics and Pragmatics; cross-linguistic and diachronic studiesNLP Tool provided as serviceLogical MetonymyDialect researchMultipurpose research corpusResource discoveryText simplificationterminological resourceSMSMultimodal Communication AnalysisThe annotations provided by the resource may be used by any application.Annotation of multimodal multimedia recordingsNumerous: sentiment, NE extraction, IR etcNamed Entity Disambiguationannotation processingTranslation StudiesMetadata editorError Detection and CorrectionInteraction between lexical resourcesInternet language normalizationWeb-based Data CollectionTrain Deep Syntactic ParsersSyntactic Parsing, Discourse ParsingIdiom detection / classificationSentiment Analysis, Event DetectionContrastive semantic studies, teaching aidsSyntax, grammarparser evaluationlanguage proofingthesaurus buildingSentiment LexiconMorphological Taggingmultipurposeconverts constituent trees to dependency treesTerm selectionFocused crawlerText Complexity AnalysisAnnotation, language modelling, encoding detection etc.Term candidates validationsyntax, grammar, documentationHistorical LinguisticsDatabase schemaCorpus-based language description, Error Analysissemantic class definitionIntra Chunk Dependency ParsingDistributional Semanticssyntaxico-semantic annotationsentence splitting, tokenisation, syntax analysissyntactic analysissolve mixed integer linear programmany: lexicography, linguistic research, language teaching, WSD, ...Text processing TerminologyStatistical phrase alignmentText parsing and Tagging Pronunciation AssessmentSpoken Term DetectionTo develop speech 
recognition system tailored for the persons disabled in articulationParsing Evaluation (NEW VALUE)Text analysisTaggerPOS tagging, domain adaptationTaxonomy Generation (NEW VALUE)Subcategorization Frames Extraction (NEW VALUE)Unsupervised part-of-speech tagging (NEW VALUE)Shallow parsing (NEW VALUE)Subjectivity AnalysisSemantic Role Labeling, Anaphora ResolutionPOS tagging, Chunking, ParsingWord Segmentation and POS taggingEvaluationExtracting semantic relationsLexicon enhancementParsing Evaluation (NEW VALUE)Text analysisTreebank for Dependency 
ParsingSpecifications for NLP resourcesAdaptationtokenization, sentence splitting, PoS-taggingsentence design of GRID was adoptedSemantic similarityPOS-taggingPOS-tagging, lemmatisationdistributed, collaborative text annotationEditing emergency domain texts in order to make them clear to understandLinguistic and language technology researchGenerates coarser grained versions of FrameNet data. Useful for shallow semantic parsing and other applications.partial parsingmorphosyntactic analysispartial parsing, taggingmorphological analysis and generationSyntactic AnnotationCreating dictionaries based on word alignmentCreating parallel corporaTool-chain for basic text analysis including morphological disambiguationto train a classifier for automatic assignment of wh-questions to verbal arguments in a system conceived to help poor-literacy readers to perform detailed text readingGeneral Natural Language ProcessingWorkflow ManagementstandardizationNatural Language AnalysisTerminology ManagementComputer-Aided Language LearningA set of metaphorical mappings and categories from the list can be used for both empirical studies of metaphorical associations and the development of computational models of metaphorAutomatic metaphor recognition and interpretation experiments; an empirical study of metaphorical associationstestingPhrase-Structure corpus analysissummarization, metonymy resolution, othersTo Develop Gujarati Morphological AnalyserA Listrr of Hindi base-forms used with Hindi Morphological AnalyserTo Learn Suffix-Replacement Rules for Morphological AnalyserA Program for Developing Morphological Analysers for South Asian Languagesvarious usestext stylisticsBilingual dictionary creationQuestionnairespeech disorder analysisGrammar developmentautomatic text analysis at morphological, analytical and tectogrammatical layers of descriptionDiverseNumber Sense Dismbiguation (similar to Word Sense Disambiguation)Raw, unannotated email textpsycholinguistic 
researchhuman-robot-interaction, robust parsingLarge Treebank for Hindi Language Processingnavigation, visualization/listening and extraction of some (semantic, sociolinguistic, etc.) information using the corpus itself (transcription/sound files) and the data base with the metadata describing the corpus).Integration with e-learning environmentsGold standard for adjective classificationmorphological derivationSyntactic parserontology matchingTo obtain test datawriting supportExtracting semantic relationsNoun phrase extractionOpinion analysis, media content analysisCoercion/Metonymy recognitionFormal knowledge and information reference for various use casesIntegrating Deep & Shallow NLPIntegrating language resources as web servicessemantic annotation, semantic searchannotation, machine learning, semantic search, testingImprovment the automatic identification of (idiomatic) Multiword ExpressionsValidation of idiomatic expressionsExtraction of Multiword ExpressionsPart of Speech Tagging, ICALLExtraction of gold Stanford dependenciesPrepare data for some of the downstream parsersSource data for gold Stanford dependenciesOpinion miningPre-processing, TokenizerCorpus annotation and searchParsing, Evaluation.Propbank frameset creationcorpus creation (this may be the same as acquisition)Voice Activity Detectionparaphrasing, text composition, authoring aid, controlled language, MT pre-editing, web serviceparaphrasing, text composition, authoring aid, controlled language, MT pre-editingmulti-purposeparaphrasing, resourcesAll applications using annotated corporaGrammar induction from positive examplesComputer Assisted Language LearningValence lexiconsentence alignmentAnaphora ResolutionSubtitle generationverification of syntactic structureResearch on Information Structure, Syntax, ProsodyLexiconPhone model trainingModel - signal alignmentLanguage researchEducationalMulti-Language lexical resource extension, question answering, textual entailmentnatural language processing 
tasks (multiple), speech recognition, multimedia processing and generation, knowledge representation, computer vision tasks (e.g. object recognition, event recognition), and computational analysis of kinematicsSpontaneous speech descriptionCorpora annotationfor text prepocessingStandard text preparationrecommendationLexicography, Statistical NLP Tasks, Applied and Theoritical NLPFrame Semantic Annotationmorphological annotation/tagging of a corpusIn all text processing applicationsfor text preprocessingcomparison with Czech Web Corpussemantic relatednessWeb-based Data Collection of SpeechText AnnotationAnnotation Evaluation, Statistical TrainingSemantic Role Labellingword sense disambiguation, semantic relations, ontologymorphosyntactic tagging, lemmatization, chunkingMorphosyntactic taggingcleaning web pages, tokenization, sentence splitting, morphological analysis, Part-of-Speech tagging, phrase chunking, Named Entity recognition, lemmatization, multiword recognitionCleaning-up lexical resoucesAnaphora, CoreferenceCreating Verb LexiconLemmatizingModeling Social Phenomena in DiscourseMachine ReadingPlagiarismManual correction of named-entity pre-annotated corpusWordnetsFrameNet-based Shallow Semantic ParsingAnswer ValidationPropbank instance creationtransliteration similarity estimationAnnotation and grounding of toponyms in image captionsManual annotation and grounding of toponyms in image captionsEvaluation tool used for text alignment and annotationDesigned to geo-tag image captions to aid image indexing and retrievalThe resource is valuable for both linguists and NLP researchersmapping language resources and usersWeb-based applications, Information ExtractionParts of Speech Taggingquerying wordnet-like lexical databasescontrastive studies, language teaching, translation teachingPropbank frameset editorPropbank instance editorExperimental Research, Language learning, modellingmany purposesallsupporting infrastructureSpeaking avatarText analyticsTagging of Early 
Modern German corpusAutomatic annotation, manual correction of annotationCorpus-linguistic investigations, comparative studies with other historical corporamultiple uses, primarily meant as translation toolsmultiple usesGeneral metadata, potentially useful for a wide rage of applicationsstandardized, general accessible data category descriptionConverting linguistic data into several formatsstandardized format for linguistic datastandardized representation of fetauresDictionary definitionchunking, parsingdevelopment of plug-inHuman communication analysis, emotion recognition, theatrical improvisation analysisAnaysis of cognitive processes in the mathematics classroomDominanceEstimationa proof of concept for the proposed algorithmsGaze estimationGeneral LT infrastructuredocument classifcation; emotion recognition; multimedia document processing; language identification; person identification; voice control; web services; forensicsCorpus generationGenerate structured values (not just named entities)As a framework for NLP module and pipeline development / hostingAvatar modelling and rigging for signingMultiple usage scenariosCorpus-based language description (grammar) and anlaysisStudy of semantic relations between words and signssecond language learningcommon annotation scheme for Sign Language unitsLSF descriptionSign Language descriptionSign Language Corpora AnnotationTagging and lemmatizationinteractive labeling to contruct statistical face modelsDevelopment of corporatext entryMyanmar Word SegmentationName MatchingMachine Learning, Automatic Term Recognitionpart-of-speech annotationall kinds of research usagesLRT descriptionComputational Lexicographyspell-checker evaluationPOS-tagging, parsing, corpus linguisticsCorpus constructionTokenisation, PoS tagging, Parsingcan be both WSD and information extractionemotion recognition, annotation, analysis of meaning and disambiguation of affective terms according to the recorded affective speechSentiment Analysis, Network 
AnalysisDependency Parsing OptimizationDependency Parser GeneratorTranslation error analysis evaluationModeling lexical resources, NLP lexiconsResource-based NLP applications, e.g. Word Sense DisambiguationLanguage DocumentationAnnotation standard/guidelines, validation schema, conversion toolsDiachronic studieslanguage evaluationLanguage learningscientific research - language acquisitionTreebankLinguistic and NLP ResearchString comparisoneye trackingAutomatic Phonetic Transcription and Segmentationcorpus compilationLexicography; CALL;POS tagging, LemmatizationSemanticsdictionary look-upCompositionality of MWEsNatural Language Analysis and Production, Tagging, Machine Translation, Context Sensitive Correction, etc.writing processClustering and statistical calculationsTerm extractionTemporal Extraction and NormalizationArchivingSentiment Analysis, Opinion ClassificationText Normalizationresearch: phonetics, psycholinguistics,artificial intelligenceeducation, lexicography, contrastive studiesTemporal Processingmorphology and syntaxcomplex evaluation of machine translationconstituent-to-dependency conversionAnnotation of various phenomenaparsing, parallel parsing, machine translation, coreference resolution, anaphora resolution, natural language generation, lexical acquisitionCross-linguistic investigationMorphological analysis/synthesisMachine Learning, Shalow Parsing, Named Entity Recognition, Word Sense DisambiguationTimeML annotationStimuliLanguage Learingknowledge acquisition and text production in the environmental domainsemantic annotationSemantic role labeling, Verb sense disambiguationSubsentential alignment and terminology extractionAnnotation and EducationalCould be used in a variety of applications.Transliteration dataHistorical Linguistics, Pragmatics, General Linguisticsbasically any NLP application/corpus using linguistic annotations can make us of this resourceCorpus-based language descriptionCorpus Query Systemfor text search and LT projectstarget of 
dictionary look-upSupports multi-disciplinary researchName Entities Annotation, WSD Annotation, Relation Annotation, Anaphora AnnotationText alignmentSyntactic Researchany natural language processing task on RomanianProsodic analysisNeologism detectionGeneralNatural Language Processingtackling interoperability issues within UIMA workflowsSemantics and Pragmatics; cross-linguistic and diachronic studiesNLP Tool provided as serviceLogical MetonymyDialect researchMultipurpose research corpusResource discoveryText simplificationterminological resourceSMSMultimodal Communication AnalysisThe annotations provided by the resource may be used by any application.Annotation of multimodal multimedia recordingsNumerous: sentiment, NE extraction, IR etcNamed Entity Disambiguationannotation processingTranslation StudiesMetadata editorError Detection and CorrectionInteraction between lexical resourcesInternet language normalizationWeb-based Data CollectionTrain Deep Syntactic ParsersSyntactic Parsing, Discourse ParsingIdiom detection / classificationSentiment Analysis, Event DetectionContrastive semantic studies, teaching aidsSyntax, grammarparser evaluationlanguage proofingthesaurus buildingSentiment LexiconMorphological Taggingmultipurposeconverts constituent trees to dependency treesTerm selectionFocused crawlerText Complexity AnalysisAnnotation, language modelling, encoding detection etc.Term candidates validationsyntax, grammar, documentationHistorical LinguisticsDatabase schemaCorpus-based language description, Error Analysissemantic class definitionIntra Chunk Dependency ParsingDistributional Semanticssyntaxico-semantic annotationsentence splitting, tokenisation, syntax analysissyntactic analysissolve mixed integer linear programmany: lexicography, linguistic research, language teachng, WSD, ...Text processing TerminologyStatistical phrase alignmentText parsing and Tagging Pronunciation AssessmentSpoken Term DetectionTo develop speech recognition system tailored for the 
persons disabled in articulationParsing Evaluation (NEW VALUE)Text analysisTaggerPOS tagging, domain adaptationTaxonomy Generation (NEW VALUE)Subcategorization Frames Extraction (NEW VALUE)Unsupervised part-of-speech tagging (NEW VALUE)Shallow parsing (NEW VALUE)sqlite database fileweb crawledWrittenWrittenNot ApplicableModality independentSpeechWritten, annotated with grammatical errorsMultimodal/MultimediaSign Languagesearch results can be product of many sourcesWritten and SpokensoftwareSpeech and WrittenWritten, Transcribed SpeechTranscribed SpeechSpeech Transcriptelectronic text, handwrittenenvironmental noises: platform, shopping mall, subway, in-car highwayMicrophone and accelerometer recordings of speechAAudio Data in wav formatelectropalatography, speechlexical databaselexical semantic databasevarious modalitiesWritten and SpeechTranscribed spoken language, Child languageannotation guidelineswritten and verbatim notes of spoken languageStatisticsWritten, spokenWritten, Spoken (ASR output, or manually transcribed)Written, Oral, and TranscribedJudgments represented by numbers on specific sentencescollection of resources with different modalitiesprogramFormal rule language and OWLuse via WWWTranscription of Speechunderspecified semantic structuresSpeech, WrittenWritten; also contains movements in virtual environmentGene interaction matrixDictionaryOpen sourceSpeech + semantics of scenespeech as well as patients' meta-data (civil status and clinical information)Multilingual Ontology MatchingWritten and spoken transcriptAny modalityWritten, SpeechMultimedia, multisensorial, language and sensorimotor measurementsCurrently it is for textual data but the spatial representation is to be considered in a multimodal environment in the future.transliteration dataTextSpeech based lexiconSpeech and TextVisualComputer Visionwritten and sign languageRelaxNG schema, Pythonmany different modalitiesmultimodalmulti-modal - seperate parts for voice, face, body 
languageSpeech/WrittenmultipleMultimodal AND Sign LanguagewordnetPOS tagged corpus