Deep Learning Basics

Preliminary steps for deep learning

Required libraries

Basic skeleton

Optimizers

validation_split, batch_size

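Preliminary steps for deep learning

• The X data must have missing values handled, be scaled, and have categorical features converted to dummy variables.

• Depending on the model, the target must be encoded as well (multiclass classification → y must be encoded, with label encoding or one-hot encoding).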
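The three preprocessing steps for X can be sketched on a toy table. This is only an illustration: the column names and values are made up, and the scaling is written out by hand using the same per-column formula that MinMaxScaler applies.

```python
import pandas as pd

# Hypothetical toy data: one numeric column with a missing value, one categorical column.
df = pd.DataFrame({
    "income": [30.0, None, 50.0, 70.0],
    "region": ["east", "west", "east", "south"],
})

# 1) Missing values: fill the numeric gap with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# 2) Dummy variables: one column per category, dropping the first to avoid redundancy.
df = pd.get_dummies(df, columns=["region"], drop_first=True, dtype=float)

# 3) Min-max scaling to [0, 1], column by column.
x = (df - df.min()) / (df.max() - df.min())

print(x)
```

In a real workflow the minimum and maximum would be taken from the training split only and then reused for the validation split.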

Required libraries

from keras.models import Sequential  # builds the model as a stack of layers
from keras.layers import Dense, Input  # layer classes
from keras.backend import clear_session  # clears Keras state to free memory

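Basic skeleton

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import (root_mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error)

# x_train, x_val, y_train, y_val are assumed to be split already

# Scaling: fit on the training data only, then reuse the same transform for validation
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)


# Free memory held by earlier models
clear_session()

nfeatures = x_train.shape[1]  # number of columns
nfeatures

# Declare the model
model = Sequential( [Input(shape =(nfeatures,)),
                     Dense(1)]  )

# Model summary
model.summary()


# Compile the model
model.compile(optimizer='adam', loss='mse')

# Train
model.fit(x_train, y_train)

# Predict
pred = model.predict(x_val)

# Evaluate
print(f'RMSE  : {root_mean_squared_error(y_val, pred)}')
print(f'MAE   : {mean_absolute_error(y_val, pred)}')
print(f'MAPE  : {mean_absolute_percentage_error(y_val, pred)}')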
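As a rough illustration of what the model above computes, a single Dense(1) layer with no activation is a linear combination of the inputs plus a bias. The NumPy sketch below uses random stand-in inputs and weights (both are assumptions; the real layer learns its weights during fit).

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, nfeatures = 4, 3
x = rng.random((n_samples, nfeatures))   # stand-in for scaled inputs

# A Dense(1) layer holds one weight per feature plus one bias.
W = rng.normal(size=(nfeatures, 1))
b = np.zeros(1)

# Forward pass with no activation: a plain linear combination.
pred = x @ W + b

# Number of trainable parameters in this layer: nfeatures weights + 1 bias.
n_params = W.size + b.size
print(pred.shape, n_params)
```

The prediction has shape (n_samples, 1), which is why the output of model.predict is a 2D array.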

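Optimizers

An optimizer updates the weights so that the error (loss) is minimized.

The basic idea is gradient descent: to find the minimum of the loss function with respect to the weights (the point where the derivative, i.e. the gradient, is close to 0), the weights are adjusted repeatedly using the direction and magnitude of the gradient.

A limitation is that, depending on the shape of the loss function, local minima can exist. Training is therefore not guaranteed to reach the global minimum, and different runs can end at different points.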
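A minimal sketch of plain gradient descent, assuming a one-weight toy loss (w - 3)^2 whose minimum is known to be at w = 3. The loss function, starting point, and learning rate are illustrative choices, not values from these notes.

```python
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of (w - 3)^2 with respect to w
    return 2.0 * (w - 3.0)

w = 0.0                 # initial weight
learning_rate = 0.1

for step in range(100):
    # Move against the gradient; the step shrinks as the gradient approaches 0.
    w -= learning_rate * grad(w)

print(w, loss(w))
```

This toy loss has a single minimum, so the local-minimum issue does not appear here; it arises with the non-convex losses of real networks.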

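Optimizers are an active research area. The main ones so far are listed below.

Optimizer	Intuition	Characteristics
GD (gradient descent)	Looks at every data point, one by one, before each update
SGD (stochastic gradient descent)	Looks at only part of the data at a time and updates sooner	- Combined with momentum, works well for large image datasets or when fine-grained optimization is needed - The learning rate needs tuning - Performs well on large datasets and CNNs
Momentum	Carries a fraction of the previous gradients into the current update direction
Adagrad	Moves quickly through unexplored regions and searches frequently visited ones more finely	- Adapts the learning rate per parameter - Suited to datasets with sparse features, e.g. natural language processing
RMSProp	Like Adagrad, but shrinks the step size based on recent gradient history	- Adjusts the learning rate automatically - Performs well when gradients fluctuate strongly, as in RNNs
Adam	RMSProp + momentum	- Combines momentum with RMSProp-style adaptive learning rates; it generally converges quickly and performs well, so many models use it as the default optimizer - Adjusts the learning rate automatically - Fast early training - The usual first choice for a new experiment
AdamW	Improved version of Adam	- Speed and performance similar to Adam, with better generalization (decoupled weight decay)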

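validation_split, batch_size

• validation_split

Holds out part of the training data as validation data so that model performance can be checked during training.

A ratio of 0.2 is common. The held-out data is used to monitor loss or accuracy during training, not to update the weights.

• batch_size

The number of samples the model processes in one training step.

The training data is split into batches, and batch_size sets how many samples go into each batch.

A larger batch uses more memory, but each update averages over more samples, so training tends to be more stable.

The two settings serve different purposes, but they are usually set together in model.fit().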
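The arithmetic behind the two settings can be sketched as follows. The sample count is hypothetical, and the split rule (flooring the training share) reflects my understanding of how Keras slices the arrays, so treat it as an assumption.

```python
import math

n_samples = 1000          # hypothetical size of x_train
validation_split = 0.2
batch_size = 32           # Keras' default when batch_size is not given

# Samples kept for weight updates vs. held out for validation
n_train = math.floor(n_samples * (1 - validation_split))
n_val = n_samples - n_train

# Number of weight updates per epoch (the last batch may be smaller)
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)
```

Keras takes the validation samples from the end of the arrays, before any shuffling, so data that is sorted by target should be shuffled beforehand.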

.history

model.fit() returns a History object. Its .history attribute holds the loss recorded at each epoch during training (train and validation).

With this record you can plot loss against epoch, which gives the learning curve.
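For orientation, the record is a plain dict of lists with one value per epoch. The numbers below are invented to show the shape of the data, not output from a real run.

```python
# Hypothetical values shaped like the dict returned by model.fit(...).history
history = {
    "loss":     [0.90, 0.60, 0.45, 0.38, 0.33],
    "val_loss": [0.95, 0.70, 0.55, 0.57, 0.62],
}

n_epochs = len(history["loss"])

# Epoch (1-based) at which the validation loss was lowest
best_epoch = min(range(n_epochs), key=lambda i: history["val_loss"][i]) + 1

print(n_epochs, best_epoch)
```

The "val_loss" key is only present when validation data or validation_split is passed to fit.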

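Interpreting learning curves

• Underfitting (training not finished)

    ◦ Both the validation and training loss are still decreasing and have not flattened out.

        ▪ Increase the number of epochs or raise learning_rate.

• train_err fluctuates up and down

    ◦ The weight updates are too coarse.

        ▪ Lower learning_rate.

• Overfitting

    ◦ train_err keeps decreasing and stabilizes, but valid_err starts rising at some point.

        ▪ Reduce the number of epochs.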
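The effect of learning_rate on the curve can be reproduced on a toy problem. The sketch below runs gradient descent on loss(w) = w^2 with three illustrative rates; it is a simplified stand-in for a network, and overshooting is only one of the mechanisms behind a jagged curve.

```python
def run_gd(learning_rate, steps=20, w0=1.0):
    """Gradient descent on loss(w) = w**2; returns the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        w -= learning_rate * 2.0 * w   # gradient of w**2 is 2w
        losses.append(w ** 2)
    return losses

slow = run_gd(0.01)        # still clearly decreasing after 20 steps
good = run_gd(0.1)         # converges quickly
diverging = run_gd(1.05)   # every step overshoots the minimum and the loss grows

print(slow[-1], good[-1], diverging[-1])
```

The first case matches the "not finished" curve (more epochs or a higher rate helps), and the last shows why an unstable curve calls for a lower rate.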

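Activation functions

Activation functions serve two roles:

In hidden layers, they turn the linear output of a layer into a nonlinear one (typically ReLU).

In the output layer, they convert the raw result into the form the task needs (mainly sigmoid or softmax).

Function	Definition	Notes
ReLU	Returns the input unchanged when it is greater than 0, and 0 otherwise	- Simple and effective - Fast to compute and less prone to vanishing gradients - The most widely used; common in CNNs
Leaky ReLU	Variant of ReLU with a very small slope (typically 0.01) for inputs below 0	- Mitigates the dead-neuron problem of ReLU - The most used after ReLU - Requires an extra parameter (the negative slope)
ELU	Returns the input unchanged when it is greater than 0, and applies an exponential function below 0	- Mitigates the dying-ReLU problem - Relatively slow to compute - Used about as widely as Leaky ReLU
Sigmoid	Maps the input to a value between 0 and 1	- Useful for binary classification, where a probability output is needed - Suffers from vanishing gradients, so it is rarely used in deep networks - Not used in hidden layers; mainly used in the output layer to produce probabilities
Tanh	Maps the input to a value between -1 and 1	- Widely used before ReLU became standard - Can suffer from vanishing gradients - Like sigmoid, rarely used in hidden layers, though it still appears in RNNs - Sometimes used in the output layer
Swish	Proposed by Google; the input multiplied by its sigmoid, x · sigmoid(x)	- Similar advantages to ReLU; can improve training of deeper networks - Relatively expensive to compute - Increasingly used in newer architectures and research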
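The definitions in the table can be written out directly in NumPy. This is a sketch for checking values by hand, not how Keras implements them; the slope of 0.01 and alpha of 1.0 are the commonly quoted defaults and are assumptions here.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    return x * sigmoid(x)

x = np.array([-2.0, 0.0, 2.0])

for name, fn in [("relu", relu), ("leaky_relu", leaky_relu), ("elu", elu),
                 ("sigmoid", sigmoid), ("tanh", np.tanh), ("swish", swish)]:
    print(name, fn(x))
```

Printing the same three inputs through each function makes the differences on the negative side easy to compare.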

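Number of hidden-layer nodes

The number of nodes in the hidden layers has a significant effect on model performance.

Common approaches and rules of thumb are listed below.

1. Based on the number of input and output nodes

• Pick a value between the input-layer and output-layer node counts, or sometimes a larger one.

• For example, with 64 input nodes and 10 output nodes, you could try 32, 64, or 128 nodes in the hidden layer.

2. Gradual increase

• With too few hidden nodes, the network may not extract enough information to learn from. With too many, the risk of overfitting grows.

• So start small and increase the node count step by step, comparing performance as you go.

• Powers of 2 such as 32, 64, 128, 256, and 512 are the usual candidates.

3. Fewer nodes in deeper layers (pyramid shape)

• A pyramid shape, where the node count shrinks as the network gets deeper, is common.

• For example, start the first hidden layer at 512, then use 256 for the second, 128 for the third, and 64 for the fourth.

• The idea is that the earlier layers work with high-dimensional information, while deeper layers compress it into more abstract, lower-dimensional representations.

4. Experimental optimization (hyperparameter tuning)

• Grid search, random search, or optimization libraries such as HyperOpt and Optuna can be used to search for the best node count.

• The node count is one of several hyperparameters to tune together, and the best value is found experimentally for each model and dataset.

5. Empirical starting points

• Simple data: one or two hidden layers with a moderate node count such as 32, 64, or 128 may be enough.

• Complex data: several hidden layers with hundreds to over a thousand nodes may be needed.

• Data with many features, such as images or natural language, generally calls for models with more nodes.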
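The power-of-2 candidates and the pyramid shape are easy to generate programmatically. The helper below is hypothetical (its name and the `minimum` floor are my own choices), meant only to show the pattern.

```python
def pyramid_units(first, n_layers, minimum=8):
    """Halve the node count at each successive hidden layer, never going below `minimum`."""
    units = []
    current = first
    for _ in range(n_layers):
        units.append(max(current, minimum))
        current //= 2
    return units

# Candidate sizes for a single hidden layer: 32, 64, 128, 256, 512
candidates = [2 ** k for k in range(5, 10)]

print(candidates)
print(pyramid_units(512, 4))
```

Each value returned by pyramid_units could then be passed as the unit count of one Dense layer.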

Regression

Classification

Binary classification

• Loss function: binary_crossentropy

• Hidden layer activation: ReLU (typical)

• Output layer activation: sigmoid

• Post-processing the output

    ◦ The sigmoid output is a probability, so it has to be converted to 0 or 1.

    ◦ np.where(condition, 1, 0)

pred = model.predict(x_val)
pred_1 = np.where(pred >= 0.5, 1, 0)
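To make the loss concrete, binary cross-entropy can be computed by hand on a few made-up sigmoid outputs. The clipping constant is an assumption added to avoid log(0); Keras applies its own small epsilon.

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1, 0, 1, 0])
p      = np.array([0.9, 0.2, 0.4, 0.6])   # hypothetical sigmoid outputs

loss = binary_crossentropy(y_true, p)

# Same post-processing as above: probabilities to 0/1 labels
labels = np.where(p >= 0.5, 1, 0)

print(loss, labels)
```

The two confident, correct predictions contribute little to the loss, while the two on the wrong side of 0.5 dominate it.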

Multiclass classification

• Loss function

    ◦ y preprocessed with label encoding → sparse_categorical_crossentropy

    ◦ y preprocessed with one-hot encoding → categorical_crossentropy

The two losses are mathematically the same calculation.

What differs is how y is encoded, and the two encoders are used slightly differently in scikit-learn.

ex1. Label Encoding

from sklearn.preprocessing import LabelEncoder

int_encoder = LabelEncoder()

data['Species_encoded'] = int_encoder.fit_transform(data['Species'])

# Show the encoded categories
print(int_encoder.classes_)

ex2. One-Hot Encoding

from sklearn.preprocessing import OneHotEncoder

oh_encoder = OneHotEncoder()

# OneHotEncoder expects 2D input, hence the double brackets
encoded_y1 = oh_encoder.fit_transform(data[['Species']])

# The result is a sparse matrix; toarray() converts it to a dense array
print(encoded_y1.toarray())
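That the two losses agree can be checked numerically. The sketch below uses invented softmax outputs for two samples and three classes, and computes both forms by hand in NumPy rather than through Keras.

```python
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])        # hypothetical softmax outputs
y_label = np.array([0, 2])                 # label-encoded targets
y_onehot = np.eye(3)[y_label]              # the same targets, one-hot encoded

# Sparse form: pick the predicted probability of the true class by index
sparse_ce = -np.log(probs[np.arange(len(y_label)), y_label]).mean()

# Categorical form: the one-hot vector zeroes out every other class
categorical_ce = -(y_onehot * np.log(probs)).sum(axis=1).mean()

print(sparse_ce, categorical_ce)
```

The practical difference is memory: the label-encoded target is one integer per sample, while the one-hot target needs one column per class.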

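• Hidden layer activation: ReLU (typical)

• Output layer activation: softmax

• Post-processing the output

    ◦ The softmax output is a probability for each class, so it has to be converted to a class label.

    ◦ np.argmax(pred, axis=1): returns the index of the largest value in each row

pred = model.predict(x_val)
pred_1 = pred.argmax(axis=1)  # index of the largest probability in each row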
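A small NumPy sketch of softmax followed by argmax, assuming made-up pre-activation values for two samples and three classes. Subtracting the row maximum is a standard stability trick and does not change the result.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 3.0]])       # hypothetical pre-activation outputs

probs = softmax(logits)            # each row sums to 1
classes = probs.argmax(axis=1)     # predicted class index per sample

print(probs)
print(classes)
```

If y was label-encoded, the resulting indices can be mapped back to the original names with the encoder's classes_ attribute.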

Practice - Regression

• Dataset: Carseats.csv

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import *
from sklearn.preprocessing import MinMaxScaler

from keras.models import Sequential
from keras.layers import Dense, Input
from keras.backend import clear_session
from keras.optimizers import Adam

Define the learning-curve function

def dl_history_plot(history):
    # history: the dict from model.fit(...).history
    plt.figure(figsize=(10,6))
    plt.plot(history['loss'], label='train_err', marker='.')
    plt.plot(history['val_loss'], label='val_err', marker='.')

    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.grid()
    plt.show()

path = 'https://raw.githubusercontent.com/DA4BAM/dataset/master/Carseats.csv'
data = pd.read_csv(path)
data.head()

Sales	CompPrice	Income	Advertising	Population	Price	ShelveLoc	Age	Education	Urban	US
0	9.50	138	73	11	276	120	Bad	42	17	Yes
1	11.22	111	48	16	260	83	Good	65	10	Yes
2	10.06	113	35	10	269	80	Medium	59	12	Yes
3	7.40	117	100	4	466	97	Medium	55	14	Yes
4	4.15	141	64	3	340	128	Bad	38	13	Yes

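# Split x and y
target = 'Sales'
x = data.drop(target, axis=1)
y = data[target]

# Convert the categorical columns to dummy variables (MinMaxScaler needs numeric input)
x = pd.get_dummies(x, columns=['ShelveLoc', 'Urban', 'US'], drop_first=True)

# Split the data: train / val
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=.2, random_state=0)


# Scaling
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)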

# Design the deep learning model
clear_session()
n_features = x_train.shape[1]

model = Sequential([Input(shape=(n_features,)),
                    Dense(50, activation='relu'),
                    Dense(50, activation='relu'),
                    Dense(50, activation='relu'),
                    Dense(10, activation='relu'),
                    Dense(1)])

model.summary()

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.01), loss='mse')

# Train; .history keeps the per-epoch loss record
history = model.fit(x_train, y_train, epochs=50, validation_split=.2, verbose=0).history

dl_history_plot(history)

# Evaluate
pred1 = model.predict(x_val)

print(f'RMSE : {root_mean_squared_error(y_val, pred1)}')
print(f'MAE : {mean_absolute_error(y_val, pred1)}')
print(f'MAPE : {mean_absolute_percentage_error(y_val, pred1)}')

RMSE : 1.4091517041027484
MAE : 1.1444624982476232
MAPE : 163947921119641.8
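The MAPE above is enormous while RMSE and MAE look reasonable. One plausible cause, which is an assumption and was not checked against the data here, is an actual value at or near 0 in the validation set, since MAPE divides each error by the actual value. The sketch below reproduces the effect on made-up numbers, using a hand-written MAPE that guards the division with machine epsilon (to my understanding, scikit-learn does the same).

```python
import numpy as np

def mape(y_true, y_pred, eps=np.finfo(np.float64).eps):
    # |error| / max(|actual|, eps), averaged over samples
    return float(np.mean(np.abs(y_true - y_pred) / np.maximum(np.abs(y_true), eps)))

y_true = np.array([10.0, 8.0, 0.0])   # hypothetical actuals, one of them exactly 0
y_pred = np.array([ 9.0, 8.5, 1.0])

with_zero = mape(y_true, y_pred)
without_zero = mape(y_true[:2], y_pred[:2])

print(with_zero, without_zero)
```

When the target can be 0, RMSE and MAE are the more trustworthy of the three metrics.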

Additional experiment 1 - validation performance by number of hidden-layer nodes

# Validation performance by number of nodes
def modeling_test1(node):

    clear_session()
    model = Sequential([Input(shape=(n_features,)),
                        Dense(node, activation='relu'),
                        Dense(1)])

    model.compile(optimizer=Adam(learning_rate=0.01), loss='mse')
    model.fit(x_train, y_train, epochs=50, verbose=False)

    pred = model.predict(x_val)
    mae = mean_absolute_error(y_val, pred)

    return mae

from tqdm.auto import tqdm

nodes = [2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]
result = []
for n in tqdm(nodes):
    result.append(modeling_test1(n))

# Plot the validation MAE for each node count
plt.plot(nodes, result)
plt.grid()
plt.show()

Additional experiment 2 - validation performance by number of hidden layers

# Validation performance by number of hidden layers
def modeling_test2(layer):

    clear_session()

    # Start with the input layer and the first hidden layer
    layer_list = [Input(shape=(n_features,)),
                  Dense(10, activation='relu')]

    # Add the remaining hidden layers so that the total equals `layer`
    for i in range(1, layer):
        layer_list.append(Dense(10, activation='relu'))

    # Add the output layer and declare the model
    layer_list.append(Dense(1))
    model = Sequential(layer_list)

    # summary
    model.summary()

    # compile + train
    model.compile(optimizer=Adam(learning_rate=0.01), loss='mse')
    model.fit(x_train, y_train, epochs=50, verbose=False)

    # predict
    pred = model.predict(x_val)
    mae = mean_absolute_error(y_val, pred)

    return mae

layers = list(range(1,11)) # layer : 1 ~ 10
result = []
for l in layers:
    result.append(modeling_test2(l))

plt.plot(layers, result)
plt.grid()
plt.show()

Practice - Binary classification

• Dataset: Attrition_train_validation.csv

Load the required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.metrics import *
from sklearn.preprocessing import MinMaxScaler

from keras.models import Sequential
from keras.layers import Dense, Input
from keras.backend import clear_session
from keras.optimizers import Adam

Define the learning-curve function

def dl_history_plot(history):
    # history: the dict from model.fit(...).history
    plt.figure(figsize=(10,6))
    plt.plot(history['loss'], label='train_err', marker='.')
    plt.plot(history['val_loss'], label='val_err', marker='.')

    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.grid()
    plt.show()

# Load the data and encode the target as 1 (Yes) / 0 (No)
path = 'https://raw.githubusercontent.com/DA4BAM/dataset/master/Attrition_train_validation.CSV'
data = pd.read_csv(path)
data['Attrition'] = np.where(data['Attrition']=='Yes',1,0)
data.head(10)

Attrition	Age	BusinessTravel	Department	DistanceFromHome	Education	EducationField	EmployeeNumber	EnvironmentSatisfaction	Gender	...	OverTime	PercentSalaryHike	RelationshipSatisfaction	StockOptionLevel	TotalWorkingYears	TrainingTimesLastYear	WorkLifeBalance	YearsAtCompany	YearsInCurrentRole	YearsWithCurrManager
0	0	33	Travel_Rarely	Research & Development	7	3	Medical	817	3	Male	...	No	11	4	0	14	3	4	13	9
1	0	35	Travel_Frequently	Research & Development	18	2	Life Sciences	1412	3	Male	...	No	11	3	0	10	2	3	2	2
2	0	42	Travel_Rarely	Research & Development	6	3	Medical	1911	3	Male	...	No	13	2	1	18	3	4	13	7
3	0	46	Travel_Rarely	Sales	2	3	Marketing	1204	3	Female	...	No	23	1	0	28	2	3	26	15
4	0	39	Travel_Frequently	Sales	20	3	Life Sciences	1812	3	Male	...	No	18	4	1	7	6	3	2	1
5	1	22	Travel_Frequently	Research & Development	4	1	Technical Degree	593	3	Male	...	No	16	3	0	4	3	3	2	2
6	0	24	Travel_Rarely	Research & Development	21	2	Technical Degree	1551	3	Male	...	No	14	2	3	2	3	3	1	1
7	0	34	Travel_Rarely	Research & Development	8	3	Medical	2068	2	Male	...	No	12	1	0	6	3	4	4	3
8	0	30	Travel_Rarely	Research & Development	20	3	Other	1084	3	Male	...	No	15	3	1	7	1	2	6	2
9	0	26	Travel_Rarely	Research & Development	6	3	Life Sciences	686	3	Female	...	Yes

• Data preparation

# Drop the unneeded variable
data.drop('EmployeeNumber', axis=1, inplace=True)

# Split x and y
target = 'Attrition'
x = data.drop(target, axis=1)
y = data[target]


# Categorical x variables -> dummy variables
dum_cols = ['BusinessTravel','Department','Education','EducationField','EnvironmentSatisfaction','Gender',
            'JobRole', 'JobInvolvement', 'JobSatisfaction', 'MaritalStatus', 'OverTime', 'RelationshipSatisfaction',
            'StockOptionLevel','WorkLifeBalance' ]

x = pd.get_dummies(x, columns=dum_cols, drop_first=True)

• Train / val split

x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=200, random_state=0)

• Scaling

scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)

Model 1

n = x_train.shape[1]

clear_session()

model1 = Sequential([Input(shape=(n, )),
                     Dense(12, activation='relu'),
                     Dense(1, activation='sigmoid')])

model1.summary()

# compile + train
model1.compile(optimizer=Adam(learning_rate=0.01), loss='binary_crossentropy', metrics=['recall'])
hist1 = model1.fit(x_train, y_train, epochs=30, validation_split=.2, verbose=0).history

dl_history_plot(hist1)

# predict
pred = model1.predict(x_val)
pred1 = np.where(pred >= 0.5, 1, 0)

print(classification_report(y_val, pred1))

Model 2

clear_session()


model2 = Sequential([Input(shape=(n,)),
                     Dense(25, activation='relu'),
                     Dense(12, activation='relu'),
                     Dense(4, activation='relu'),
                     Dense(1, activation='sigmoid')])

model2.summary()

# compile + train
model2.compile(optimizer=Adam(learning_rate=0.01), loss='binary_crossentropy', metrics=['recall'])
hist2 = model2.fit(x_train, y_train, validation_split=.2, epochs=20, verbose=0).history

dl_history_plot(hist2)

# predict
pred = model2.predict(x_val)
pred2 = np.where(pred >= 0.5, 1,0)

print(classification_report(y_val, pred2))
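Both models track recall, and the 0.5 cutoff used in np.where is a choice rather than a requirement. The sketch below, on invented labels and probabilities, shows how lowering the threshold trades precision for recall; the helper name is hypothetical.

```python
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
prob   = np.array([0.8, 0.45, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05])   # hypothetical sigmoid outputs

def recall_precision(threshold):
    pred = np.where(prob >= threshold, 1, 0)
    tp = int(((pred == 1) & (y_true == 1)).sum())   # true positives
    fn = int(((pred == 0) & (y_true == 1)).sum())   # missed positives
    fp = int(((pred == 1) & (y_true == 0)).sum())   # false alarms
    return tp / (tp + fn), tp / (tp + fp)

at_05 = recall_precision(0.5)
at_03 = recall_precision(0.3)

print(at_05, at_03)
```

With imbalanced targets such as attrition, comparing a few thresholds on the validation set can matter as much as the network architecture.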

Practice - Multiclass classification

Wine classification problem

Load the required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.metrics import *
from sklearn.preprocessing import MinMaxScaler

from keras.models import Sequential
from keras.layers import Dense, Input
from keras.backend import clear_session
from keras.optimizers import Adam

• Learning-curve function

# Learning-curve function
def dl_history_plot(history):
    plt.figure(figsize=(10,6))
    plt.plot(history['loss'], label='train_err', marker = '.')
    plt.plot(history['val_loss'], label='val_err', marker = '.')

    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.grid()
    plt.show()

• Load the data

• Dataset: winequality-white.csv

    ◦ y classes: 0 to 4

path = "https://raw.githubusercontent.com/DA4BAM/dataset/master/winequality-white.csv"
data = pd.read_csv(path)
# Merge the rare grades into their neighbors (3 -> 4, 9 -> 8), then shift 4..8 down to 0..4
data['quality'] = np.where(data['quality'] == 3, 4, np.where(data['quality'] == 9, 8, data['quality']))
data['quality'] = data['quality'] - 4
data.head()
data['quality'].value_counts()

fixed acidity	volatile acidity	citric acid	residual sugar	chlorides	free sulfur dioxide	total sulfur dioxide	density	pH	sulphates	alcohol	quality
0	7.0	0.27	0.36	20.7	0.045	45.0	170.0	1.0010	3.00	0.45	8.8
1	6.3	0.30	0.34	1.6	0.049	14.0	132.0	0.9940	3.30	0.49	9.5
2	8.1	0.28	0.40	6.9	0.050	30.0	97.0	0.9951	3.26	0.44	10.1
3	7.2	0.23	0.32	8.5	0.058	47.0	186.0	0.9956	3.19	0.40	9.9
4	7.2	0.23	0.32	8.5	0.058	47.0	186.0	0.9956	3.19	0.40	9.9

	count
quality
2	2198
1	1457
3	880
0	183
4	180
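The relabeling of quality can be traced on a small array. The input below is a hypothetical sample covering grades 3 to 9, not rows from the dataset.

```python
import numpy as np

quality = np.array([3, 4, 5, 6, 7, 8, 9])   # hypothetical sample of raw grades

# Step 1: fold the extreme grades into their neighbors (3 -> 4, 9 -> 8)
merged = np.where(quality == 3, 4, np.where(quality == 9, 8, quality))

# Step 2: shift so the classes start at 0, as sparse_categorical_crossentropy expects
labels = merged - 4

print(merged)
print(labels)
```

The result has exactly five distinct classes, which is why the output layers in the models for this dataset use 5 nodes.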

• Data preprocessing

# Split x and y
target = 'quality'
x = data.drop(target, axis=1)
y = data[target]

# train / val split
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size= .3, random_state = 20)


# scaling
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)

Model 1

n = x_train.shape[1]

# Design the model
clear_session()
model1 = Sequential([Input(shape=(n, )),
                     Dense(5, activation='softmax')])

model1.summary()

# compile
model1.compile(optimizer=Adam(learning_rate=0.01),
               loss='sparse_categorical_crossentropy') # the target is label-encoded (integer classes)

hist1 = model1.fit(x_train, y_train, epochs=100, verbose=0, validation_split=.2).history

dl_history_plot(hist1)

# Predictions
pred = model1.predict(x_val)
pred1 = np.argmax(pred, axis=1)

print(classification_report(y_val, pred1))

Model 2

clear_session()
model2 = Sequential([Input(shape=(n, )),
                     Dense(10, activation='relu'),
                     Dense(8, activation='relu'),
                     Dense(5, activation='softmax')])

model2.summary()

# compile
model2.compile(optimizer=Adam(learning_rate=0.001),
               loss='sparse_categorical_crossentropy')

hist2 = model2.fit(x_train, y_train, epochs=50, verbose=0, validation_split=.2).history

dl_history_plot(hist2)

# Predictions
pred = model2.predict(x_val)
pred2 = np.argmax(pred, axis=1)

print(classification_report(y_val, pred2))

MNIST images

• Load the required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.metrics import *
from sklearn.preprocessing import StandardScaler, MinMaxScaler

from keras.models import Sequential
from keras.layers import Dense, Flatten, Input
from keras.backend import clear_session
from keras.optimizers import Adam
from keras.datasets import mnist, fashion_mnist

• Load the data

(x_train, y_train), (x_val, y_val) = mnist.load_data()
x_train.shape, y_train.shape
((60000, 28, 28), (60000,))

class_names = ['0','1','2','3','4','5','6','7','8','9']

• Look at one image

n = 400

plt.figure()
plt.imshow(x_train[n], cmap=plt.cm.binary)
plt.colorbar()
plt.show()

np.set_printoptions(linewidth=1000)
x_train[n]

• Flatten the data to 2D (one row of 784 pixels per image)

x_train = x_train.reshape(60000, -1)
x_val = x_val.reshape(10000, -1)
x_train.shape, x_val.shape
((60000, 784), (10000, 784))

• Image scaling

# Pixel values range from 0 to 255, so simply divide by 255
x_train = x_train / 255.
x_val = x_val / 255.
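The reshape and scaling steps can be checked on a small stand-in array without downloading MNIST. The random images below are an assumption used only to show the shapes and value ranges involved.

```python
import numpy as np

# Stand-in for a handful of 28x28 grayscale images with values 0..255
images = np.random.default_rng(0).integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the second dimension: 28 * 28 = 784
flat = images.reshape(images.shape[0], -1)

# Dividing by a float converts the uint8 pixels to floats in [0, 1]
scaled = flat / 255.

print(flat.shape, scaled.dtype, scaled.min(), scaled.max())
```

Rows are laid out one after another, so pixel (row 1, column 0) of an image ends up at position 28 in its flattened vector.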

• Modeling

nfeatures = x_train.shape[1]

clear_session()

model = Sequential([Input(shape = (nfeatures,)),
                    Dense(10, activation = 'softmax')])

model.summary()

# compile
model.compile(optimizer=Adam(learning_rate=0.001), loss= 'sparse_categorical_crossentropy' )

history = model.fit(x_train, y_train, epochs = 20, validation_split=0.2).history

dl_history_plot(history)

• Prediction and evaluation

pred = model.predict(x_val)
pred_1 = pred.argmax(axis=1)

print(confusion_matrix(y_val, pred_1))
print(classification_report(y_val, pred_1))