Code Analysis - Task 3

0. Shell file analysis

1) run_preprocess_gpt2.sh

1> How arguments are assigned depending on the argument count

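#!/bin/bash
# With no arguments, fall back to the current directory and ../data.
if [[ $# -lt 1 ]]
then
    PATH_DIR=$(realpath .)
    PATH_DATA_DIR=$(realpath ../data)
else
    # Otherwise $1 is the working directory and $2 is the data directory.
    # Note: with exactly one argument, $2 is empty here.
    PATH_DIR=$(realpath "$1")
    PATH_DATA_DIR=$(realpath "$2")
fi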
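The branch above treats both arguments as all-or-nothing. As a minimal sketch of an alternative, assuming it is acceptable for each argument to fall back to its default independently, the same assignment can be written with parameter expansion. The function name `resolve_paths` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): resolve both directories,
# letting each argument fall back to its own default.
resolve_paths() {
    # ${1:-.} expands to "." when the first argument is unset or empty.
    PATH_DIR=$(realpath "${1:-.}")
    # ${2:-../data} expands to "../data" when the second argument is unset or empty.
    PATH_DATA_DIR=$(realpath "${2:-../data}")
}

resolve_paths "$@"
echo "PATH_DIR=${PATH_DIR}"
echo "PATH_DATA_DIR=${PATH_DATA_DIR}"
```

With this form, passing only the working directory still leaves PATH_DATA_DIR pointing at ../data.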

2> Classification of preprocessing variants

DOMAIN : Furniture vs Fashion

Multimodal : Multimodal vs non-multimodal

Dataset split : train vs dev vs devtest (vs test)

※ As a result, many combinations of these options are possible, as the script itself shows.
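To make the number of cases concrete, the three options can be enumerated with nested loops. This is only an illustrative sketch: the function name `list_preprocess_cases` and the lowercase labels are assumptions, and the optional test split is left out, which gives 2 × 2 × 3 = 12 combinations.

```shell
#!/bin/bash
# Illustrative only: print every domain / modality / split combination.
# The labels are placeholders, not the names used in the original script.
list_preprocess_cases() {
    local domain modality split
    for domain in furniture fashion; do
        for modality in multimodal non-multimodal; do
            for split in train dev devtest; do
                echo "${domain} ${modality} ${split}"
            done
        done
    done
}

list_preprocess_cases
```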

2) run_train_gpt2.sh

1> argument

#!/bin/bash
if [[ $# -lt 1 ]]
then
    PATH_DIR=$(realpath .)
else
    PATH_DIR=$(realpath "$1")
fi

The argument is the directory path (PATH_DIR); when it is omitted, the current directory is used.

2> Classification of training variants

DOMAIN : Furniture vs Fashion

Multimodal : Multimodal vs text-only

※ This gives 4 variants in total (2 domains × 2 modality settings).

3) run_generate_gpt2.sh

(Same structure as the train script.)

1> argument

#!/bin/bash
if [[ $# -lt 1 ]]
then
    PATH_DIR=$(realpath .)
else
    PATH_DIR=$(realpath "$1")
fi

The argument is the directory path (PATH_DIR); when it is omitted, the current directory is used.

2> Classification of generation variants

DOMAIN : Furniture vs Fashion

Multimodal : Multimodal vs text-only

※ This gives 4 variants in total (2 domains × 2 modality settings).

4) run_evaluate_gpt2.sh

(Same structure as the train script.)

1. run_preprocess_gpt2.sh

Runs the gpt2_dst.scripts.preprocess_input module.

1) Arguments passed by the shell file

1> --input_path_json

2> --output_path_predict

3> --output_path_target

4> --output_path_special_tokens

5> --len_context

6> --use_multimodal_contexts
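The six arguments above can be assembled into a single command line. The sketch below only builds and prints the command and does not run it. Every path, the function name `build_preprocess_cmd`, and the values given to --len_context and --use_multimodal_contexts are placeholders chosen for illustration, on the assumption that both of those flags take integer values.

```shell
#!/bin/bash
# Build (but do not execute) a preprocess command line.
# All paths and values below are placeholders, not the ones in the original script.
build_preprocess_cmd() {
    local domain=$1 split=$2 multimodal=$3
    PREPROCESS_CMD=(
        python -m gpt2_dst.scripts.preprocess_input
        --input_path_json="data/${domain}_${split}.json"
        --output_path_predict="out/${domain}_${split}_predict.txt"
        --output_path_target="out/${domain}_${split}_target.txt"
        --output_path_special_tokens="out/${domain}_special_tokens.json"
        --len_context=2
        --use_multimodal_contexts="${multimodal}"
    )
}

build_preprocess_cmd furniture train 1
# Print one array element per line for inspection.
printf '%s\n' "${PREPROCESS_CMD[@]}"
```

Keeping the command in an array makes it easy to inspect before running it with "${PREPROCESS_CMD[@]}" from the repository root.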

2) convert_json_to_flattened

Converts SIMMC data into the GPT-2 format.

3. run_train_gpt2.sh

Runs the gpt2_dst.scripts.run_language_modeling module.

Fine-tunes models such as GPT, GPT-2, BERT, and RoBERTa.

1> Custom Dataset

2> tokenize

3> train

4> evaluate

The module carries out the steps listed above, among others.

4. run_generate_gpt2.sh
