SQL ํ’€์ด#

Hits

Attention

ํ…Œ์ด๋ธ” ์ƒ์„ฑ ๋ฐ ๋ฐ์ดํ„ฐ ๋กœ๋“œ๋Š” ํŒŒ์ด์ฌ ๊ธฐ๋ฐ˜์œผ๋กœ ๋™์ž‘ํ•ฉ๋‹ˆ๋‹ค.
์ ์ ˆํ•œ ์ฟผ๋ฆฌ๋ฅผ ์ž…๋ ฅํ•˜์—ฌ ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ๋ฅผ ์ง„ํ–‰ํ•ฉ๋‹ˆ๋‹ค

์ฝ”๋“œ ๊ฐ€์ด๋“œ#

Attention

  1. connectDB ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด url csv๋ฅผ ๋กœ์ปฌ์— sqlite3 ํ˜•์‹์˜ ๋ฐ์ดํ„ฐ ๋ฒ ์ด์Šค๋กœ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  2. ์ฟผ๋ฆฌ๋ฅผ strํ˜•์‹์œผ๋กœ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค(DB table ์ด๋ฆ„์€ ์ฃผ์–ด์ง‘๋‹ˆ๋‹ค)

  3. pd.read_sql({์ฟผ๋ฆฌ},con={ํ…Œ์ด๋ธ”}) ํ˜•์‹์œผ๋กœ ์ฟผ๋ฆฌ๋ฅผ ๋‚ ๋ ค ๊ฒฐ๊ณผ๋ฅผ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค.

ex) table = connectDB(โ€˜https://raw.githubusercontent.com/Datamanim/pandas/main/lol.csvโ€™,โ€™lolโ€™,โ€™\tโ€™)

url, table_name, sep

sql =โ€™select * from lolโ€™ # query Ans = pd.read_sql(sql,con=table) # result

import pandas as pd
def connectDB(url,table_name,sep=',',encoding='utf-8'):
    import sqlite3
    cnx = sqlite3.connect(':memory:')
    df = pd.read_csv(url,sep=sep,encoding=encoding)
    df.to_sql(name=table_name, con=cnx)    
    return cnx

Question 1

์ „์ฒด ๋ฐ์ดํ„ฐ๋ฅผ ๋กœ๋“œํ•˜๋ผ.

table = connectDB('https://raw.githubusercontent.com/Datamanim/pandas/main/lol.csv','lol','\t')
sql ='select * from lol'
Ans = pd.read_sql(sql,con=table)
display(Ans)
index gameId creationTime gameDuration seasonId winner firstBlood firstTower firstInhibitor firstBaron ... t2_towerKills t2_inhibitorKills t2_baronKills t2_dragonKills t2_riftHeraldKills t2_ban1 t2_ban2 t2_ban3 t2_ban4 t2_ban5
0 0 3326086514 1504279457970 1949 9 1 2 1 1 1 ... 5 0 0 1 1 114 67 43 16 51
1 1 3229566029 1497848803862 1851 9 1 1 1 1 0 ... 2 0 0 0 0 11 67 238 51 420
2 2 3327363504 1504360103310 1493 9 1 2 1 1 1 ... 2 0 0 1 0 157 238 121 57 28
3 3 3326856598 1504348503996 1758 9 1 1 1 1 1 ... 0 0 0 0 0 164 18 141 40 51
4 4 3330080762 1504554410899 2094 9 1 2 1 1 1 ... 3 0 0 1 0 86 11 201 122 18
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
51485 51485 3308904636 1503076540231 1944 9 2 1 2 2 0 ... 10 2 0 4 0 55 -1 90 238 157
51486 51486 3215685759 1496957179355 3304 9 2 1 1 2 2 ... 11 7 4 4 1 157 55 119 154 105
51487 51487 3322765040 1504029863961 2156 9 2 2 2 2 0 ... 10 2 0 2 0 113 122 53 11 157
51488 51488 3256675373 1499562036246 1475 9 2 2 2 2 0 ... 11 3 0 1 0 154 39 51 90 114
51489 51489 3317333020 1503612754059 1445 9 1 1 1 1 1 ... 1 0 0 1 0 11 157 141 31 18

51490 rows ร— 62 columns

Question 2

๋ฐ์ดํ„ฐ์˜ ์ƒ์œ„ 5๊ฐœ ํ–‰์„ ์ถœ๋ ฅํ•˜๋ผ

sql ='select * from lol LIMIT 5' 
Ans = pd.read_sql(sql,con=table)
Ans
index gameId creationTime gameDuration seasonId winner firstBlood firstTower firstInhibitor firstBaron ... t2_towerKills t2_inhibitorKills t2_baronKills t2_dragonKills t2_riftHeraldKills t2_ban1 t2_ban2 t2_ban3 t2_ban4 t2_ban5
0 0 3326086514 1504279457970 1949 9 1 2 1 1 1 ... 5 0 0 1 1 114 67 43 16 51
1 1 3229566029 1497848803862 1851 9 1 1 1 1 0 ... 2 0 0 0 0 11 67 238 51 420
2 2 3327363504 1504360103310 1493 9 1 2 1 1 1 ... 2 0 0 1 0 157 238 121 57 28
3 3 3326856598 1504348503996 1758 9 1 1 1 1 1 ... 0 0 0 0 0 164 18 141 40 51
4 4 3330080762 1504554410899 2094 9 1 2 1 1 1 ... 3 0 0 1 0 86 11 201 122 18

5 rows ร— 62 columns

Question 3

๋ฐ์ดํ„ฐ์˜ ํ–‰์˜ ๊ฐฏ์ˆ˜๋ฅผ ํŒŒ์•…ํ•˜๋ผ

sql ="select count(*) as rowcount from lol;" 
Ans = pd.read_sql(sql,con=table)
Ans
rowcount
0 51490

Question 4

์ „์ฒด ์ปฌ๋Ÿผ์„ ์ถœ๋ ฅํ•˜๋ผ(sqlite๋งŒ์˜ ๋ฌธ๋ฒ•)

sql ="select name from pragma_table_info('lol');" 
Ans = pd.read_sql(sql,con=table)
Ans
name
0 index
1 gameId
2 creationTime
3 gameDuration
4 seasonId
... ...
57 t2_ban1
58 t2_ban2
59 t2_ban3
60 t2_ban4
61 t2_ban5

62 rows ร— 1 columns

Question 5

6๋ฒˆ์งธ ์ปฌ๋Ÿผ๋ช…์„ ์ถœ๋ ฅํ•˜๋ผ

sql ="select name from pragma_table_info('lol') LIMIT 1 OFFSET 5;" 
Ans = pd.read_sql(sql,con=table)
Ans
name
0 winner

Question 6

6๋ฒˆ์งธ ์ปฌ๋Ÿผ์˜ ๋ฐ์ดํ„ฐ ํƒ€์ž…์„ ํ™•์ธํ•˜๋ผ

sql ="select type from pragma_table_info('lol') LIMIT 1 OFFSET 5;" 
Ans = pd.read_sql(sql,con=table)
Ans
type
0 INTEGER

Question 7

gameId ์ปฌ๋Ÿผ์˜ 3๋ฒˆ์งธ ๊ฐ’์€ ๋ฌด์—‡์ธ๊ฐ€?

sql ="SELECT gameId FROM lol LIMIT 1 OFFSET 2;" 
Ans = pd.read_sql(sql,con=table)
Ans
gameId
0 3327363504

Question 8

3๋ฒˆ์งธ ํ–‰์˜ creationTime, gameDuration ์ปฌ๋Ÿผ์— ํ•ด๋‹นํ•˜๋Š” ๊ฐ’์€ ๋ฌด์—‡์ธ๊ฐ€?

sql ="SELECT creationTime,gameDuration FROM lol LIMIT 1 OFFSET 2;" 
Ans = pd.read_sql(sql,con=table)
Ans
creationTime gameDuration
0 1504360103310 1493

Attention

์ œ์ฃผ ๋‚ ์”จ,์ธ๊ตฌ์— ๋”ฐ๋ฅธ ๊ตํ†ต๋Ÿ‰๋ฐ์ดํ„ฐ : ์ถœ์ฒ˜ ์ œ์ฃผ ๋ฐ์ดํ„ฐ ํ—ˆ๋ธŒ DataUrl = โ€˜https://raw.githubusercontent.com/Datamanim/pandas/main/Jeju.csvโ€™

Question 9

๋ฐ์ดํ„ฐ๋ฅผ ๋กœ๋“œํ•˜๋ผ. ์ปฌ๋Ÿผ์ด ํ•œ๊ธ€์ด๊ธฐ์— ์ ์ ˆํ•œ ์ฒ˜๋ฆฌํ•ด์ค˜์•ผํ•จ

table = connectDB('https://raw.githubusercontent.com/Datamanim/pandas/main/Jeju.csv','jeju',encoding='euc-kr') 
# ์ตœ์ดˆ 1ํšŒ๋งŒ ์‹คํ–‰
sql ='SELECT * FROM jeju LIMIT 5' # table_name = jeju
Ans = pd.read_sql(sql,con=table)
Ans
index id ์ผ์ž ์‹œ๋„๋ช… ์๋ฉด๋™๋ช… ๊ฑฐ์ฃผ์ธ๊ตฌ ๊ทผ๋ฌด์ธ๊ตฌ ๋ฐฉ๋ฌธ์ธ๊ตฌ ์ด ์œ ๋™์ธ๊ตฌ ํ‰๊ท  ์†๋„ ํ‰๊ท  ์†Œ์š” ์‹œ๊ฐ„ ํ‰๊ท  ๊ธฐ์˜จ ์ผ๊ฐ•์ˆ˜๋Ÿ‰ ํ‰๊ท  ํ’์†
0 0 22448 2018-01-01 ์ œ์ฃผ์‹œ ๋„๋‘๋™ 32249.987 3418.266 102709.092 138377.345 39.556 29.167 5.0 0.0 2.5
1 1 22449 2018-01-01 ์ œ์ฃผ์‹œ ์™ธ๋„๋™ 213500.997 10341.172 112692.789 336534.958 32.900 30.900 5.0 0.0 2.5
2 2 22450 2018-01-01 ์ œ์ฃผ์‹œ ์ด๋„2๋™ 1212382.218 96920.834 541194.481 1850497.533 29.538 35.692 2.9 0.0 2.4
3 3 22451 2018-01-01 ์ œ์ฃผ์‹œ ์ผ๋„1๋™ 33991.653 6034.253 72155.919 112181.825 30.000 23.500 2.9 0.0 2.4
4 4 22452 2018-01-01 ์„œ๊ท€ํฌ์‹œ ๋Œ€์ฒœ๋™ 155036.925 9403.969 150882.409 315323.303 41.583 14.375 5.1 0.0 2.3

Question 10

๋ฐ์ดํ„ฐ ๋งˆ์ง€๋ง‰ 3๊ฐœํ–‰์„ ์ถœ๋ ฅํ•˜๋ผ

sql ='SELECT * FROM jeju LIMIT 10 OFFSET (SELECT count(*) FROM jeju)-3' # table_name = jeju
Ans = pd.read_sql(sql,con=table)
Ans
index id ์ผ์ž ์‹œ๋„๋ช… ์๋ฉด๋™๋ช… ๊ฑฐ์ฃผ์ธ๊ตฌ ๊ทผ๋ฌด์ธ๊ตฌ ๋ฐฉ๋ฌธ์ธ๊ตฌ ์ด ์œ ๋™์ธ๊ตฌ ํ‰๊ท  ์†๋„ ํ‰๊ท  ์†Œ์š” ์‹œ๊ฐ„ ํ‰๊ท  ๊ธฐ์˜จ ์ผ๊ฐ•์ˆ˜๋Ÿ‰ ํ‰๊ท  ํ’์†
0 9618 32066 2020-04-30 ์ œ์ฃผ์‹œ ๋„๋‘๋™ 28397.481 3144.895 84052.697 115595.073 41.053 29.421 20.3 0.0 3.0
1 9619 32067 2020-04-30 ์„œ๊ท€ํฌ์‹œ ์•ˆ๋•๋ฉด 348037.846 29106.286 251129.660 628273.792 46.595 49.189 17.6 0.0 3.5
2 9620 32068 2020-04-30 ์ œ์ฃผ์‹œ ์—ฐ๋™ 1010643.372 65673.477 447622.068 1523938.917 40.863 27.765 14.1 0.0 4.8

Question 11

์ˆ˜์น˜ํ˜• ๋ณ€์ˆ˜๋ฅผ ๊ฐ€์ง„ ์ปฌ๋Ÿผ๋งŒ ํ•„ํ„ฐํ•˜์—ฌ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์„ ๋งŒ๋“ค๊ณ  ์ƒ์œ„ 5ํ–‰์„ ์ถœ๋ ฅํ•˜๋ผ

sql ="SELECT name FROM pragma_table_info('jeju') WHERE type ='INTEGER' or type='REAL';" 
Ans = pd.read_sql(sql,con=table)
Ans
name
0 index
1 id
2 ๊ฑฐ์ฃผ์ธ๊ตฌ
3 ๊ทผ๋ฌด์ธ๊ตฌ
4 ๋ฐฉ๋ฌธ์ธ๊ตฌ
5 ์ด ์œ ๋™์ธ๊ตฌ
6 ํ‰๊ท  ์†๋„
7 ํ‰๊ท  ์†Œ์š” ์‹œ๊ฐ„
8 ํ‰๊ท  ๊ธฐ์˜จ
9 ์ผ๊ฐ•์ˆ˜๋Ÿ‰
10 ํ‰๊ท  ํ’์†