About: Corpus linguistics

An Entity of Type : owl:Thing, within Data Space : 134.155.108.49:8890 associated with source dataset(s)

Corpus linguistics is the study of language as expressed in samples (corpora) of "real world" text. This method represents a digestive approach to deriving the abstract rules by which a natural language is governed, or by which it relates to another language. Corpora were originally compiled by hand, but are now largely derived by an automated process whose output is then corrected. A core step in building a corpus is part-of-speech tagging: deriving a set of tags that gives a formal overview of the types of words and word relationships in a given language.

Attributes | Values
rdfs:label
  • Corpus linguistics
sameAs
dbkwik:cogling/pro...iPageUsesTemplate
abstract
  • Computational methods were once viewed as a holy grail of linguistic research, one that would ultimately yield a rule set for high-level natural language processing and machine translation. This has not happened, and since the cognitive revolution, cognitive linguistics has been largely critical of many of the practical uses claimed for corpora. However, as computing capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability. The corpus approach runs counter to Noam Chomsky's view that real language is riddled with performance-related errors and therefore requires careful analysis of small speech samples obtained in a highly controlled laboratory setting. Corpus linguistics does away with Chomsky's competence/performance split; its adherents believe that reliable language analysis is best carried out on field-collected samples, in natural contexts and with minimal experimental interference.
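The description above mentions corpora that are tagged automatically and then corrected. As a rough illustration only, the Python sketch below tags a sentence from a tiny lookup table, applies a correction pass, and counts the resulting part-of-speech tags. The lexicon, the tag names, and the helper functions are all hypothetical and chosen for this example; real corpus tools rely on much larger resources or trained models.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for this sketch, not a real tag set).
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "park": "NOUN", "walks": "NOUN",
    "in": "ADP",
    "barks": "VERB",
}


def auto_tag(tokens, lexicon=LEXICON, default="X"):
    """Give each token the tag listed in the lexicon, or `default` if unknown."""
    return [(tok, lexicon.get(tok.lower(), default)) for tok in tokens]


def apply_corrections(tagged, corrections):
    """Override automatic tags at the given token positions.

    `corrections` maps a token index to the tag it should carry; this stands
    in for the correction step, assumed here to be done by a human annotator.
    """
    fixed = list(tagged)
    for index, tag in corrections.items():
        token, _ = fixed[index]
        fixed[index] = (token, tag)
    return fixed


def tag_counts(tagged):
    """Count how often each part-of-speech tag occurs."""
    return Counter(tag for _, tag in tagged)


sentence = "The dog walks in the park".split()
tagged = auto_tag(sentence)

# The lexicon only knows "walks" as a noun, but it is a verb in this sentence,
# so the correction pass overrides the tag at position 2.
corrected = apply_corrections(tagged, {2: "VERB"})

print(corrected)
print(tag_counts(corrected))
```

The lookup table is deliberately naive: it makes the kind of error (a word with more than one possible tag) that a correction pass exists to catch.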