Andrew Carlson - Toward an Architecture for Never-Ending Language Learning (2010)
Created: June 9, 2017 / Updated: November 2, 2024 / Status: finished / 2 min read (~271 words)
- A knowledge base of first-order predicates and their instances is built by crawling the web
- Using four components, NELL extracts textual patterns, queries the Internet for instances of known predicates, classifies these instances using logistic regression models, and learns new first-order relations
- After running for 67 days, it had learned over 240,000 facts with an estimated precision of 74% (90%, 71%, and 57% over three successive evaluation periods)
- By a "never-ending language learner" we mean a computer system that runs 24 hours per day, 7 days per week, forever, performing the following two tasks each day:
- Reading task: extract information from web text to further populate a growing knowledge base of structured facts and knowledge
- Learning task: learn to better read each day than the day before, as evidenced by its ability to go back to yesterday's text sources and extract more information more accurately
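The loop below is a minimal sketch of this daily read/learn cycle. The `Extractor` interface, `KnowledgeBase` class, and promotion rule are hypothetical stand-ins, not NELL's actual implementation; in the paper, candidate beliefs are promoted based on agreement among the coupled components.

```python
# A minimal sketch of the never-ending read/learn loop, assuming a
# hypothetical Extractor interface with .extract() and .retrain() methods.

class KnowledgeBase:
    def __init__(self):
        self.beliefs = set()    # promoted (predicate, instance) facts
        self.candidates = {}    # candidate fact -> list of component confidences

def daily_iteration(kb, extractors, corpus, threshold=0.9):
    # Reading task: every component proposes candidate facts from the corpus,
    # conditioned on what the knowledge base already believes.
    for extractor in extractors:
        for fact, confidence in extractor.extract(corpus, kb.beliefs):
            kb.candidates.setdefault(fact, []).append(confidence)

    # Promotion: accept a candidate when several components agree on it or
    # one component is very confident (a stand-in for NELL's coupling).
    for fact, scores in list(kb.candidates.items()):
        if len(scores) >= 2 or max(scores) > threshold:
            kb.beliefs.add(fact)
            del kb.candidates[fact]

    # Learning task: retrain every component on the enlarged belief set,
    # so that tomorrow's reading pass is more accurate than today's.
    for extractor in extractors:
        extractor.retrain(kb.beliefs)
```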
- NELL uses four subsystem components:
- Coupled Pattern Learner: A free-text extractor which learns and uses contextual patterns like "mayor of X" and "X plays for Y" to extract instances of categories and relations
- Coupled SEAL: A semi-structured extractor which queries the Internet with sets of beliefs from each category or relation, then mines lists and tables to extract novel instances of the corresponding predicate
- Coupled Morphological Classifier: A set of binary $L_2$-regularized logistic regression models - one per category - which classify noun phrases based on various morphological features (words, capitalization, affixes, parts-of-speech, etc.); see the sketch after this list
- Rule Learner: A first-order relational learning algorithm similar to FOIL, which learns probabilistic Horn clauses
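As a toy illustration of one per-category classifier in the spirit of the Coupled Morphological Classifier, the scikit-learn sketch below trains an $L_2$-regularized logistic regression on simple morphological features. The feature set, the example noun phrases, and the shortcut of using other categories' instances as negatives are illustrative assumptions, not the paper's exact setup.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def morphological_features(noun_phrase):
    # Crude morphological features: head word, capitalization, suffix, length.
    words = noun_phrase.split()
    return {
        "last_word": words[-1].lower(),
        "is_capitalized": noun_phrase[0].isupper(),
        "suffix3": words[-1][-3:].lower(),  # a simple affix feature
        "num_words": len(words),
    }

# Hypothetical training data for a "city" category: promoted instances serve
# as positives, instances of other categories as negatives (mutual exclusion).
positives = ["New York", "Paris", "San Francisco"]
negatives = ["Barack Obama", "Microsoft", "guitar"]

vec = DictVectorizer()
X = vec.fit_transform(morphological_features(np) for np in positives + negatives)
y = [1] * len(positives) + [0] * len(negatives)

clf = LogisticRegression(penalty="l2", C=1.0)  # L2 regularization, as in the paper
clf.fit(X, y)

# Probability that an unseen noun phrase belongs to the category.
print(clf.predict_proba(vec.transform([morphological_features("Los Angeles")]))[0, 1])
```

In NELL, one such binary model exists per category, and their outputs are coupled with the other three components before any candidate is promoted into the knowledge base.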
- Carlson, Andrew, et al. "Toward an Architecture for Never-Ending Language Learning." AAAI. Vol. 5. 2010.