- Artificial intelligence: machine learning and natural language processing (NLP)



* pytorch/pytorch (PyTorch)

https://github.com/pytorch/pytorch - 42.2k

    - v1.6.0, 2020.07.29

    - Released by Facebook in 2016

https://pytorch.org/

    - Supported languages: Python, C++, Julia

    - Torch, the library it grew out of, stopped active development in 2017

    - Used in the autonomous driving systems of Tesla and Uber



//-------------------------------------------

< Natural language processing libraries built on PyTorch >

 

* allenai/allennlp

https://github.com/allenai/allennlp - 9.7k

    - v2.0.1, 2021/02

 

 

* flairNLP/flair

https://github.com/flairNLP/flair - 10.0k

    - v0.7, 2020/12



//------------------------------------------------------------------------------

< How to use AllenNLP >

    - Install the libraries

pip install allennlp

 

pip install allennlp-models



    - Sentiment analysis example

        - https://demo.allennlp.org/sentiment-analysis

 

pip install allennlp==1.1.0 allennlp-models==1.1.0
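
Because the example pins specific versions, it can help to confirm what is actually installed before running it. The sketch below is only one possible check and is not part of AllenNLP: it uses the standard library's importlib.metadata (Python 3.8+), and the helper name installed_version is made up for this note.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a pip package, or None if absent.

    Hypothetical helper for this note, not an AllenNLP function.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Package names are the ones used in the pip commands above.
for name in ("allennlp", "allennlp-models"):
    print(name, installed_version(name) or "not installed")
```

If either line prints "not installed", or a version other than the pinned one, re-run the pip command before trying the script.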

 

//------------------

// sen.py

 

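from allennlp.predictors.predictor import Predictor

# Importing this module registers the classification models and predictors
import allennlp_models.classification

# Download and load the pretrained Stanford Sentiment Treebank model
predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/basic_stanford_sentiment_treebank-2020.06.09.tar.gz")

# Run the model on a single sentence
ret = predictor.predict(
    sentence="so unremittingly awful that labeling it a dog probably constitutes cruelty to canines")

# ret is a dict holding logits, probs, token_ids, label and tokens
print(ret)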



    //-----------------------

    - Error encountered

error loading _jsonnet (this is expected on Windows), treating

 

    - Fix

pip install jsonnetbin
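
To see whether the _jsonnet module named in the error message can be found after installing, a small check like the following may help. It is a sketch using only the standard library; the helper name module_available is an assumption made for this note.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located (without importing it).

    Hypothetical helper for this note.
    """
    return importlib.util.find_spec(name) is not None


# "_jsonnet" is the module name that appears in the error message above.
if module_available("_jsonnet"):
    print("_jsonnet found")
else:
    print("_jsonnet missing: try 'pip install jsonnetbin'")
```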




    - Output of running the script

{'logits': [-1.9308886528015137, 1.9371905326843262], 'probs': [0.020470665767788887, 0.9795293211936951], 'token_ids': [44, 11708, 739, 11, 18763, 13, 4, 1109, 367, 18764, 6030, 8, 18765], 'label': '0', 'tokens': ['so', 'unremittingly', 'awful', 'that', 'labeling', 'it', 'a', 'dog', 'probably', 'constitutes', 'cruelty', 'to', 'canines']}

 

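        - Interpreting the result: in probs, the first value is the positive share and the second is the negative share, so this sentence is about 98% negative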
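
The relation between logits and probs in the output above can be checked by hand: probs appears to be the softmax of logits. The sketch below assumes that, and follows the note above in treating the first probability as positive and the second as negative. It uses only the standard library and copies the numbers from the printed result; the function names are made up for illustration.

```python
import math

# Fields copied from the output printed by sen.py above.
result = {
    "logits": [-1.9308886528015137, 1.9371905326843262],
    "probs": [0.020470665767788887, 0.9795293211936951],
    "label": "0",
}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(result):
    """Name the winning class, assuming probs = [positive, negative]."""
    positive, negative = result["probs"]
    sentiment = "positive" if positive > negative else "negative"
    return sentiment, max(positive, negative)


recomputed = softmax(result["logits"])
sentiment, confidence = describe(result)
print(recomputed)
print(sentiment, confidence)
```

The recomputed values agree with probs to within rounding, and the sentence comes out as negative.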





//-------------------------------------------------------------------

AllenNLP demo

https://demo.allennlp.org/



< Question answering >

* Reading Comprehension           <=====

    - Answers a question about a given passage (by picking out the key phrase)



* Visual Question Answering: answers questions about an image

    

 

//--------------------------

< Annotate a sentence >

* Named Entity Recognition: identifies proper nouns



* Open Information Extraction

    - Extracts a list of propositions, each made up of a single predicate and any number of arguments



* Sentiment Analysis        <=====

    - Determines whether a sentence is positive or negative



* Dependency Parsing

    - Uses grammatical analysis to work out how each head word relates to the other words



* Constituency Parsing

    - Splits a sentence into sub-phrases, or constituents



* Semantic Role Labeling (SRL)

    - Identifies the semantic role of each word or phrase

    - A preparatory step that makes it possible to answer basic questions




//-------------------

< Annotate a passage >

* Coreference Resolution

    - Groups the expressions that refer to the same thing

    - Used in document summarization, question answering, information extraction, and similar tasks



//-------------------

< Semantic parsing >

* WikiTableQuestions Semantic Parser (semantic parsing over tables)

    - Answers questions by analyzing tabular data



* Cornell NLVR Semantic Parser

    - NLVR(Natural Language for Visual Reasoning)

        - http://lil.nlp.cornell.edu/nlvr/

        - Determines whether a sentence describing an image is true



* Text to SQL (ATIS): converts a sentence into SQL code

    



//-------------------

< Other >



* Textual Entailment           <=====

    - Determines whether a hypothesis follows from a premise



* Language Modeling        <=====

    - Predicts the next word



* Masked Language Modeling         <=====

    - Fills in the blank
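
To make the last two items concrete, here is a toy sketch of next-word prediction and fill-in-the-blank. It is not how the AllenNLP demo models work (those are neural networks); it only counts which word follows which in a tiny made-up corpus. The corpus and the function names predict_next and fill_blank are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only.
corpus = [
    "the movie was good",
    "the movie was awful",
    "the movie was good fun",
    "the dog was good",
]

# following[w] counts the words seen immediately after w.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for left, right in zip(words, words[1:]):
        following[left][right] += 1


def predict_next(word):
    """Language modeling: return the most frequent word seen after `word`."""
    candidates = following.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def fill_blank(left, right):
    """Masked language modeling: pick the word that best fits `left ___ right`."""
    best_word, best_score = None, 0
    for candidate, left_count in following.get(left, Counter()).items():
        # Score = how often candidate follows left, times how often right follows candidate.
        score = left_count * following.get(candidate, Counter())[right]
        if score > best_score:
            best_word, best_score = candidate, score
    return best_word


print(predict_next("was"))
print(fill_blank("the", "was"))
```

The difference between the two tasks shows up in the inputs: next-word prediction sees only the left context, while fill-in-the-blank uses the words on both sides.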




Posted by codens