Mock Exam 2



Task Type 1

Attention

Data source : https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset (post-processed)
Data description : predicting whether a stroke occurs
dataurl : https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/train.csv

import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/train.csv')

df.head()

	id	gender	age	ever_married	work_type	Residence_type	avg_glucose_level	bmi	smoking_status	stroke
0	1192	Female	31	No	Govt_job	Rural	70.66	27.2	never smoked	0
1	77	Female	13	No	children	Rural	85.81	18.6	Unknown	0
2	59200	Male	18	No	Private	Urban	60.56	33.0	never smoked	0
3	24905	Female	65	Yes	Private	Urban	205.77	46.0	formerly smoked	1
4	24257	Male	4	No	children	Rural	90.42	16.2	Unknown	0

Question1

What is the mean age of the patients whose gender is Male?

44.68623481781376
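
A minimal sketch of the filter-then-average step, assuming the `gender` and `age` columns shown above. The `toy` frame and the helper name `mean_age_of_males` are made up for illustration; pass the loaded `df` instead to work on the real data.

```python
import pandas as pd

def mean_age_of_males(df: pd.DataFrame) -> float:
    """Mean of `age` over rows where `gender` equals 'Male'."""
    return float(df.loc[df["gender"] == "Male", "age"].mean())

# Hypothetical toy data, not taken from the stroke dataset.
toy = pd.DataFrame({"gender": ["Male", "Female", "Male"], "age": [20, 30, 40]})
male_mean = mean_age_of_males(toy)
print(male_mean)
```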

Question2

Fill the missing values in the bmi column with the median of the non-missing bmi values, then compute the mean of the bmi column to three decimal places.

29.166
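
One way to sketch median imputation with pandas: `Series.median()` skips NaN by default, so it already is the median of the non-missing values. The helper name and the `toy` frame are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

def mean_after_median_fill(df: pd.DataFrame, col: str = "bmi") -> float:
    """Fill NaN in `col` with the column median, then return the mean rounded to 3 places."""
    median = df[col].median()  # NaN values are ignored here
    return round(float(df[col].fillna(median).mean()), 3)

# Hypothetical toy data: the median of 20, 30, 40 is 30.
toy = pd.DataFrame({"bmi": [20.0, np.nan, 30.0, 40.0]})
median_fill_mean = mean_after_median_fill(toy)
print(median_fill_mean)
```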

Question3

Fill each missing value in the bmi column with the bmi value of the immediately preceding row, then compute the mean of the bmi column to three decimal places.

29.188
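
Filling from the previous row is a forward fill, which can be sketched with `Series.ffill()`. The helper name and `toy` data are hypothetical. Note that a missing value in the very first row has nothing before it, so it stays NaN and is skipped by `mean()`.

```python
import numpy as np
import pandas as pd

def mean_after_forward_fill(df: pd.DataFrame, col: str = "bmi") -> float:
    """Forward-fill NaN in `col` from the previous row, then return the mean rounded to 3 places."""
    return round(float(df[col].ffill().mean()), 3)

# Hypothetical toy data: becomes 20, 20, 30, 30, 40 after the forward fill.
toy = pd.DataFrame({"bmi": [20.0, np.nan, 30.0, np.nan, 40.0]})
ffill_mean = mean_after_forward_fill(toy)
print(ffill_mean)
```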

Question4

Replace each missing value in the bmi column with the mean bmi of that patient's age group (10-year bands), then compute the mean of the resulting bmi column to three decimal places.

29.2627029367386 (29.263 when rounded to three decimal places)
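
A sketch of group-wise imputation, assuming "age group" means the age floored to a multiple of 10 (0-9, 10-19, and so on); the source does not spell out the banding, so treat that as an assumption. The helper name and `toy` frame are made up.

```python
import numpy as np
import pandas as pd

def mean_after_age_band_fill(df: pd.DataFrame) -> float:
    """Fill NaN bmi with the mean bmi of the same 10-year age band, then return the overall mean."""
    band = df["age"] // 10 * 10                            # assumed banding: 23 -> 20, 38 -> 30
    band_mean = df.groupby(band)["bmi"].transform("mean")  # per-row mean of its own band
    return round(float(df["bmi"].fillna(band_mean).mean()), 3)

# Hypothetical toy data: band 20 has mean 25, band 30 has mean 40.
toy = pd.DataFrame({
    "age": [21, 25, 29, 34, 38],
    "bmi": [20.0, np.nan, 30.0, 40.0, np.nan],
})
band_fill_mean = mean_after_age_band_fill(toy)
print(band_fill_mean)
```

`transform("mean")` returns a Series aligned with the original rows, which is what lets `fillna` pick the right band mean for each missing entry.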

Question5

Change every avg_glucose_level value of 200 or more to 199, then compute the mean avg_glucose_level of the rows where stroke is 1, to three decimal places.

125.188
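
The replace-then-filter step could look like the following sketch. It works on a copy so the input frame is left untouched; the helper name and `toy` values are hypothetical.

```python
import pandas as pd

def capped_glucose_mean_for_stroke(df: pd.DataFrame) -> float:
    """Set avg_glucose_level >= 200 to 199, then average it over rows with stroke == 1."""
    out = df.copy()
    out.loc[out["avg_glucose_level"] >= 200, "avg_glucose_level"] = 199
    return round(float(out.loc[out["stroke"] == 1, "avg_glucose_level"].mean()), 3)

# Hypothetical toy data: after capping, the stroke == 1 rows hold 199, 100, 150.
toy = pd.DataFrame({
    "avg_glucose_level": [250.0, 100.0, 200.0, 150.0],
    "stroke": [1, 1, 0, 1],
})
capped_mean = capped_glucose_mean_for_stroke(toy)
print(capped_mean)
```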

Task Type 1 (different dataset)

Attention

Data source : https://www.kaggle.com/abcsds/pokemon (for reference; data modified)
Data description : Pokémon information
data url = https://raw.githubusercontent.com/Datamanim/datarepo/main/pok/Pokemon.csv

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
0	1	Bulbasaur	Grass	Poison	318	45	49	49	65	65	45	1	False
1	2	Ivysaur	Grass	Poison	405	60	62	63	80	80	60	1	False
2	3	Venusaur	Grass	Poison	525	80	82	83	100	100	80	1	False
3	3	VenusaurMega Venusaur	Grass	Poison	625	80	100	123	122	120	80	1	False
4	4	Charmander	Fire	NaN	309	39	52	43	60	50	65	1	False

Question6

Sort the data by the Attack column in descending order. What is the difference in the number of legendary Pokémon (Legendary column) between the top 400 and those ranked 401 to 800?
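
A possible sketch of the rank-split comparison. The `split` parameter and helper name are assumptions so the idea can be shown on a small, made-up frame; with the real data `split` would be 400. Rows with tied Attack values near the cut can fall on either side depending on the sort, which this sketch does not try to resolve.

```python
import pandas as pd

def legendary_gap(df: pd.DataFrame, split: int = 400) -> int:
    """Absolute difference in Legendary counts between ranks 1..split and split+1..2*split."""
    ranked = df.sort_values("Attack", ascending=False).reset_index(drop=True)
    top = ranked.iloc[:split]["Legendary"].sum()            # True counts as 1
    rest = ranked.iloc[split:2 * split]["Legendary"].sum()
    return int(abs(top - rest))

# Hypothetical toy data: top 3 hold 2 legendaries, the next 3 hold 1.
toy = pd.DataFrame({
    "Attack": [100, 90, 80, 70, 60, 50],
    "Legendary": [True, True, False, True, False, False],
})
gap = legendary_gap(toy, split=3)
print(gap)
```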

Question7

Compute the mean of the Total column for each Type 1 value and sort the means in descending order. Which Type 1 ranks third?

Flying
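
The group, average, sort, and pick sequence can be sketched as below. The helper name and the `toy` type labels are hypothetical.

```python
import pandas as pd

def third_type_by_mean_total(df: pd.DataFrame) -> str:
    """Type 1 value whose mean Total ranks third when sorted in descending order."""
    means = df.groupby("Type 1")["Total"].mean().sort_values(ascending=False)
    return means.index[2]  # position 2 is the third entry

# Hypothetical toy data with mean Total of A=600, B=500, C=400, D=300.
toy = pd.DataFrame({
    "Type 1": ["A", "A", "B", "B", "C", "C", "D"],
    "Total": [600, 600, 500, 500, 400, 400, 300],
})
third_type = third_type_by_mean_total(toy)
print(third_type)
```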

Question8

Drop every row that contains a missing value, take the first 60% of the remaining rows in order, and compute the first quartile of the Defense column.

50.0
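
A sketch of the drop, slice, and quantile steps. It assumes the 60% row count is truncated to an integer when it is not whole, and it relies on the default linear interpolation of `Series.quantile`; both are assumptions, as are the helper name and `toy` frame.

```python
import pandas as pd

def defense_q1_first_60pct(df: pd.DataFrame) -> float:
    """Drop rows with any NaN, keep the first 60% of rows, return the 25% quantile of Defense."""
    clean = df.dropna()
    n_keep = int(len(clean) * 0.6)  # assumed: truncate a fractional row count
    return float(clean.iloc[:n_keep]["Defense"].quantile(0.25))

# Hypothetical toy data: the last row has a missing Type 2 and is dropped,
# leaving 10 rows, of which the first 6 (Defense 10..60) are kept.
toy = pd.DataFrame({
    "Type 2": ["x"] * 10 + [None],
    "Defense": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110],
})
defense_q1 = defense_q1_first_60pct(toy)
print(defense_q1)
```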

Question9

Count the Water-type Pokémon whose Attack is greater than or equal to the mean Attack of the Pokémon whose Type 1 is Fire.
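
One reading of this question, sketched under the assumption that "Water-type" also refers to the Type 1 column (the source only names Type 1 for Fire). The helper name and `toy` data are made up.

```python
import pandas as pd

def count_water_above_fire_mean(df: pd.DataFrame) -> int:
    """Number of Type 1 == 'Water' rows with Attack >= mean Attack of Type 1 == 'Fire' rows."""
    fire_mean = df.loc[df["Type 1"] == "Fire", "Attack"].mean()
    water = df[df["Type 1"] == "Water"]
    return int((water["Attack"] >= fire_mean).sum())

# Hypothetical toy data: Fire mean Attack is 60; Water rows have 60, 80, 40.
toy = pd.DataFrame({
    "Type 1": ["Fire", "Fire", "Water", "Water", "Water", "Grass"],
    "Attack": [50, 70, 60, 80, 40, 90],
})
water_count = count_water_above_fire_mean(toy)
print(water_count)
```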

Question10

Among the generations (Generation column), which one has the largest absolute difference between Speed and Defense?
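
The question does not say how Speed and Defense are aggregated within a generation. The sketch below assumes the per-generation means are compared; a different aggregation (for example sums) could give a different answer. The helper name and `toy` data are hypothetical.

```python
import pandas as pd

def generation_with_largest_gap(df: pd.DataFrame) -> int:
    """Generation with the largest |mean Speed - mean Defense| (assumed interpretation)."""
    means = df.groupby("Generation")[["Speed", "Defense"]].mean()
    gap = (means["Speed"] - means["Defense"]).abs()
    return int(gap.idxmax())

# Hypothetical toy data: generation 1 has gap 0, generation 2 has gap 60.
toy = pd.DataFrame({
    "Generation": [1, 1, 2, 2],
    "Speed": [50, 70, 100, 120],
    "Defense": [55, 65, 40, 60],
})
widest_generation = generation_with_largest_gap(toy)
print(widest_generation)
```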

Task Type 2

Attention

Data source : https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset (post-processed)
Data description : predicting whether a stroke occurs
train : https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/train.csv
test : https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/test.csv

import pandas as pd
train= pd.read_csv('https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/train.csv')
test= pd.read_csv('https://raw.githubusercontent.com/Datamanim/datarepo/main/stroke_/test.csv')

display(train.head())
display(test.head())

	id	gender	age	ever_married	work_type	Residence_type	avg_glucose_level	bmi	smoking_status	stroke
0	1192	Female	31	No	Govt_job	Rural	70.66	27.2	never smoked	0
1	77	Female	13	No	children	Rural	85.81	18.6	Unknown	0
2	59200	Male	18	No	Private	Urban	60.56	33.0	never smoked	0
3	24905	Female	65	Yes	Private	Urban	205.77	46.0	formerly smoked	1
4	24257	Male	4	No	children	Rural	90.42	16.2	Unknown	0

	id	gender	age	hypertension	ever_married	work_type	Residence_type	avg_glucose_level	bmi	smoking_status
0	47472	Female	58	0	Yes	Private	Urban	107.26	38.6	formerly smoked
1	36841	Male	78	1	Yes	Self-employed	Rural	56.11	25.5	formerly smoked
2	3135	Female	73	0	No	Self-employed	Rural	69.35	NaN	never smoked
3	65218	Male	2	0	No	children	Rural	109.10	20.0	Unknown
4	1847	Female	20	0	No	Govt_job	Rural	79.53	NaN	never smoked

test roc score :  0.8468479025076165
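
The source does not show which model produced the score above. As a rough sketch of how a train/test pair like this could be handled, the code below encodes the columns the two frames share, fits a `RandomForestClassifier` (an assumed choice, not the author's confirmed method), and checks ROC AUC on a held-out split. The data is synthetic and every name here (`prepare`, `toy_train`, `toy_test`, `submission`) is hypothetical; swap in the real `train` and `test` frames to try it.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def prepare(train, test, target="stroke", id_col="id"):
    """One-hot encode the feature columns shared by train and test; fill bmi with the train median."""
    features = [c for c in train.columns if c in test.columns and c != id_col]
    both = pd.concat([train[features], test[features]], keys=["train", "test"])
    if "bmi" in both.columns:
        both["bmi"] = both["bmi"].fillna(train["bmi"].median())
    both = pd.get_dummies(both)  # encoding jointly keeps train/test columns aligned
    return both.loc["train"], both.loc["test"], train[target]

# Synthetic stand-in data (assumption): stroke risk rises with age.
rng = np.random.default_rng(0)
n = 600
age = rng.integers(1, 90, size=n)
toy = pd.DataFrame({
    "id": np.arange(n),
    "gender": rng.choice(["Male", "Female"], size=n),
    "age": age,
    "avg_glucose_level": rng.uniform(55, 250, size=n).round(2),
    "bmi": np.where(rng.random(n) < 0.1, np.nan, rng.normal(29, 6, size=n).round(1)),
    "smoking_status": rng.choice(["never smoked", "formerly smoked", "Unknown"], size=n),
})
toy["stroke"] = (rng.random(n) < 1 / (1 + np.exp(-(age - 60) / 8))).astype(int)
toy_train = toy.iloc[:450].reset_index(drop=True)
toy_test = toy.iloc[450:].drop(columns="stroke").reset_index(drop=True)

X, X_test, y = prepare(toy_train, toy_test)
X_fit, X_val, y_fit, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_fit, y_fit)

# ROC AUC needs scores, so use the predicted probability of the positive class.
val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
submission = pd.DataFrame({
    "id": toy_test["id"],
    "stroke": model.predict_proba(X_test)[:, 1],
})
print("validation roc auc:", val_auc)
```

Scoring with `predict_proba` rather than `predict` matters here because ROC AUC ranks examples by score; hard 0/1 labels would throw most of that ranking away.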
