Difference between revisions of "Ling Luo 2015-08-31"

From cslt Wiki

Revision as of 02:21, 2 September 2015

Past work:

1. Finished training word embeddings via 5 models:

Using the EnWiki dataset (953M):

CBOW, Skip-Gram

Using the text8 dataset (95.3M):

CBOW, Skip-Gram, C&W, GloVe, LBL and Order (count-based)

2. Used the following tasks to measure the quality of the word vectors at various dimensions (10~200):

word similarity (ws)

the TOEFL set: a small dataset

analogy task: 9K semantic and 10.5K syntactic analogy questions

text classification: IMDB dataset (positive and negative reviews); the unlabeled portion is used to train the word embeddings

sentence-level sentiment classification (based on convolutional neural networks)

part-of-speech tagging
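
As an illustration of how the analogy task above can be scored, here is a minimal sketch of the common additive-offset method (pick the word closest to b - a + c). The notes do not say which scoring method was actually used, so this is an assumption; the function names and the toy 2-d vectors are hypothetical and only serve the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the additive-offset method.

    Returns the vocabulary word (excluding a, b, c) whose vector is
    closest in cosine similarity to vec(b) - vec(a) + vec(c).
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == expected
                  for a, b, c, expected in questions)
    return correct / len(questions)


# Toy vectors made up for illustration; real embeddings would be loaded
# from the trained models instead.
toy_vectors = {
    "king": [0.9, 0.8],
    "queen": [0.9, 0.2],
    "man": [0.5, 0.8],
    "woman": [0.5, 0.2],
    "apple": [-0.7, 0.1],
}
toy_questions = [("man", "king", "woman", "queen")]

if __name__ == "__main__":
    print(analogy_accuracy(toy_vectors, toy_questions))
```

With real embeddings, the semantic and syntactic question sets would each be passed through `analogy_accuracy` separately so the two scores can be compared across dimensions.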

Work this week:

word similarity: try different similarity calculation methods

named entity recognition

focus on CNN
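
For the word similarity item above, the following is a minimal sketch of how different similarity calculations could be swapped in and compared. The two measures shown (cosine and a distance-based score) and the use of Spearman rank correlation against human ratings are assumptions made for illustration; the notes do not name the methods under consideration. All function names and the toy data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_similarity(u, v):
    """Map Euclidean distance into (0, 1]; identical vectors give 1."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + distance)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def evaluate(vectors, scored_pairs, similarity):
    """Correlate model similarities with human scores for word pairs."""
    human = [score for _, _, score in scored_pairs]
    model = [similarity(vectors[w1], vectors[w2]) for w1, w2, _ in scored_pairs]
    return spearman(human, model)


# Toy vectors and ratings made up for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.3, 0.9],
}
toy_pairs = [("cat", "dog", 9.0), ("car", "truck", 8.5), ("cat", "car", 1.0)]

if __name__ == "__main__":
    for measure in (cosine_similarity, euclidean_similarity):
        print(measure.__name__, evaluate(toy_vectors, toy_pairs, measure))
```

Passing the similarity function as an argument keeps the evaluation loop unchanged while each candidate measure is tried in turn.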