๐Ÿค– Backend/Database

๋Œ€์šฉ๋Ÿ‰ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์˜ ์กฐํšŒ ์„ฑ๋Šฅ์„ ๊ฐœ์„ ํ•ด๋ณด์ž (2) : ์ธ๋ฑ์Šค ํŠœ๋‹

sckwon770 2024. 2. 25. 19:14

์ด์ „ Velog์˜ ๋Œ€์šฉ๋Ÿ‰ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์˜ ์กฐํšŒ ์„ฑ๋Šฅ์„ ๊ฐœ์„ ํ•ด๋ณด์ž (2) : ์ธ๋ฑ์Šค ํŠœ๋‹ (2023.11.12) ๋กœ๋ถ€ํ„ฐ ๋งˆ์ด๊ทธ๋ ˆ์ด์…˜๋œ ๊ธ€์ž…๋‹ˆ๋‹ค.

Github main repo / Github server repo

 

 

 


๋™๊ธฐ

1ํŽธ์˜ ์นดํ…Œ๊ณ ๋ฆฌ๋ณ„ ๋ฆฌ๋ทฐ ์กฐํšŒ๋ฅผ ์‹œ์ž‘์œผ๋กœ 500ms๊ฐ€ ๋„˜์–ด๊ฐ€๋Š” ์š”์ฒญ๋“ค์„ ์œ„์ฃผ๋กœ Cardinality/Selectivity๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ์ธ๋ฑ์Šค๋ฅผ ์ƒ์„ฑํ–ˆ๋‹ค. ํ•˜์ง€๋งŒ 1ํŽธ๊นŒ์ง€๋Š” ์ง„์งœ ๋Œ€์šฉ๋Ÿ‰ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค๋Š” ์•„๋‹ˆ์˜€๋‹ค. 2ํŽธ๋ถ€ํ„ฐ ์—ฌ๊ธฐ์–ด๋•Œ์ปดํผ๋‹ˆ ๋ฒค์ฒ˜ ํ”„๋กœ์ ํŠธ์ธ๋งŒํผ ์—ฌ๊ธฐ์–ด๋•Œ ํ™๋ณด์ž๋ฃŒ์— ์žˆ๋Š” ๋ฐ์ดํ„ฐ ๊ทœ๋ชจ์—์„œ ์„ฑ๋Šฅ์„ ์ธก์ •ํ•ด๋ณด๊ณ  ํ•„์š”ํ•˜๋‹ค๋ฉด ๊ฐœ์„ ๊นŒ์ง€ ์ง„ํ–‰ํ•ด๋ณด์ž.

๊ธฐ์กด ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ์Šค์ผ€์ผ

  • ์—ฌํ–‰์ƒํ’ˆ: 30 (๊ฐ ์นดํ…Œ๊ณ ๋ฆฌ 10๊ฐœ์”ฉ, ์ˆ™์†Œ, ์‹๋‹น, ๋ Œํ„ฐ์นด)
  • ์˜ˆ์•ฝ: 540,000 (์นดํ…Œ๊ณ ๋ฆฌ ๋ณ„ 180,000)
    - ์—ฌํ–‰์ƒํ’ˆ ๋งˆ๋‹ค 18,000๊ฐœ์˜ ์˜ˆ์•ฝ
  • ๋ฆฌ๋ทฐ: 540,000 (์นดํ…Œ๊ณ ๋ฆฌ ๋ณ„ 180,000)
    - ๋ชจ๋“  ์˜ˆ์•ฝ์— ๋ฆฌ๋ทฐ ์ž‘์„ฑ
  • ๋ฆฌ๋ทฐ ํƒœ๊ทธ: 2,700,000 (์นดํ…Œ๊ณ ๋ฆฌ ๋ณ„ 900,000)
    - ๊ฐ ๋ฆฌ๋ทฐ๋‹น 5๊ฐœ์˜ ๋ฆฌ๋ทฐ ํƒœ๊ทธ

 

 


์„ฑ๋Šฅ ํ…Œ์ŠคํŠธ

ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ๊ตฌ์ถ•

์—ฌ๊ธฐ์–ด๋•Œ ์‚ฌ์ดํŠธ์— ์˜ํ•˜๋ฉด 600๋งŒ ๊ฐœ ์ด์ƒ์˜ ๋ฆฌ๋ทฐ๋ฅผ ์ œ๊ณตํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ํ•œ๋‹ค. ์ด์— ๋งž๊ฒŒ ๋”๋ฏธ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ”๊ฐ€ํ•ด๋ณด์ž. 600๋งŒ ๊ฑด์˜ ์˜ˆ์•ฝ๊ณผ ๋ฆฌ๋ทฐ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ”๊ฐ€ํ•˜๊ณ  ๊ฐ ๋ฆฌ๋ทฐ๋‹น 3๊ฐœ์˜ ๋ฆฌ๋ทฐ ๋ถ„์„ ๊ฐ’์ด ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜์ž. ์—ฌ๊ธฐ์–ด๋•Œ์˜ ์ธ๊ธฐ ์ˆ™์†Œ๋“ค์€ 5000๊ฐœ ์ •๋„์˜ ๋ฆฌ๋ทฐ๊ฐ€ ๋‹ฌ๋ฆฌ๋ฏ€๋กœ, ํ•œ ์ƒํ’ˆ ๋‹น 5000๊ฐœ๊ฐ€ ๋“ฑ๋ก๋  ์ˆ˜ ์žˆ๋„๋ก ์ƒํ’ˆ์˜ ๊ฐœ์ˆ˜์™€ ์—ฐ๊ด€๊ด€๊ณ„๋ฅผ ์„ค์ •ํ–ˆ๋‹ค.

  • ์—ฌํ–‰์ƒํ’ˆ: 1200
  • ์˜ˆ์•ฝ: 6,000,000
    - ์—ฌํ–‰์ƒํ’ˆ ๋งˆ๋‹ค 5,000๊ฐœ์˜ ์˜ˆ์•ฝ
  • ๋ฆฌ๋ทฐ: 6,000,000
    - ๋ชจ๋“  ์˜ˆ์•ฝ์— ๋ฆฌ๋ทฐ ์ž‘์„ฑ
  • ๋ฆฌ๋ทฐ ํƒœ๊ทธ: 18,000,000
    - ๊ฐ ๋ฆฌ๋ทฐ๋‹น 3๊ฐœ์˜ ๋ฆฌ๋ทฐ ํƒœ๊ทธ

์—ฌ๊ธฐ์–ด๋•Œ ๊ณต์‹ ์‚ฌ์ดํŠธ ( https://gccompany.co.kr/ )

 


์„ฑ๋Šฅ ์ธก์ •

์š”์ฒญ ์š”์ฒญ ์ฒ˜๋ฆฌ ์‹œ๊ฐ„
๋ฆฌ๋ทฐ ๋‹จ์ผ ์กฐํšŒ 30 ms
๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 37 ms
์นดํ…Œ๊ณ ๋ฆฌ ๋ณ„ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 7504 ms
์นดํ…Œ๊ณ ๋ฆฌ+ํ‚ค์›Œ๋“œ ๋ณ„ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 4550 ms
๋ณ„์ ์ˆœ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 41 ms
๊ธ์ •์ ์ธ์ˆœ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 35 ms
์ƒํ’ˆ์˜ ๋ฆฌ๋ทฐ ํ†ต๊ณ„ ์กฐํšŒ 71 ms
์ƒํ’ˆ์˜ ๋ฆฌ๋ทฐํƒœ๊ทธ ํ†ต๊ณ„ ์กฐํšŒ 107 ms

 

์ •์ƒ์ ์œผ๋กœ ์ธ๋ฑ์Šค๋ฅผ ํƒ€์„œ ์ค€์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋ณด์ด์ง€๋งŒ, ์ผ๋ถ€ ๋น„์ •์ƒ์ ์ธ ์š”์ฒญ์ด ์žˆ๋‹ค. 1ํŽธ์—์„œ ์ง‘์ค‘์ ์œผ๋กœ ๋‹ค๋ฃฌ ๋ฆฌ๋ทฐ ์†์„ฑ๊ณผ ํ‚ค์›Œ๋“œ ํ•„ํ„ฐ๋ง ์กฐํšŒ ์š”์ฒญ์ด๋‹ค. ํŠน์ • ํƒœ๊ทธ๋ฅผ ๊ฐ€์ง„ ๋ฆฌ๋ทฐ๋ฅผ ์ •๋ ฌ ๋ฐ ํŽ˜์ด์ง•ํ•ด์•ผ ํ•˜๋ฏ€๋กœ, ์ธ๋ฑ์Šค๊ฐ€ ์ ์šฉ๋˜์–ด๋„ full scanํ•ด์•ผ ํ•œ๋‹ค. ์•„๋ž˜ ์‹œํ–‰ ๊ณ„ํš์„ ๋ณด๋ฉด Full Index Scan์ด ์‹คํ–‰๋˜์—ˆ๊ณ  17802236 rows๊ฐ€ ์กฐํšŒ๋œ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

 

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE review1_ null index PRIMARY review_created_at_index 9 null 10 100 null
1 SIMPLE <subquery2> null eq_ref <auto_distinct_key> <auto_distinct_key> 8 Reviewmate.review1_.review_id 1 100 null
2 MATERIALIZED reviewtag4_ null index review_tag_review_id_property_keyword_index,review_tag_review_id_property_polarity_index review_tag_review_id_property_polarity_index 2052 null 17902236 10 Using where; Using index

 

 


๊ฐœ์„ 

๋ฌธ์ œ ํŒŒ์•…

select
        review1_.review_id as review_i1_8_,
        review1_.created_at as created_2_8_,
        review1_.updated_at as updated_3_8_,
        review1_.content as content4_8_,
        review1_.negative_tags_count as negative5_8_,
        review1_.polarity as polarity6_8_,
        review1_.positive_tags_count as positive7_8_,
        review1_.rating as rating8_8_,
        review1_.reservation_id as reserva10_8_,
        review1_.title as title9_8_
    from
        review review1_
    where
            exists (
                select
                    1
                from
                    review_tag reviewtag4_
                where
                    reviewtag4_.review_id=review1_.review_id
                    and reviewtag4_.property="ROOM"
            )
    order by
        review1_.created_at desc limit 10;

ํ˜„์žฌ ์ ์šฉ๋˜์–ด ์žˆ๋Š” ์ธ๋ฑ์Šค๋Š” 2๊ฐ€์ง€ ์ด๋‹ค.

  • reviewtag index (review_id, property)
  • review index (created_at DESC)

where ์ ˆ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ํ…Œ์ด๋ธ”๋“ค์— ์ธ๋ฑ์Šค๋ฅผ ์ ์šฉํ•˜์˜€๊ณ  SQL explain์„ ํ†ตํ•ด ์ •์ƒ์ ์œผ๋กœ ์ธ๋ฑ์Šค๋ฅผ ํƒ€๊ณ  ์žˆ์Œ์—๋„ ๊ธด ์‹œ๊ฐ„์ด ์†Œ๋ชจ๋œ๋‹ค. ๋„์ €ํžˆ ๋ชจ๋ฅด๊ฒ ์–ด์„œ ๋™์•„๋ฆฌ ์„ ๋ฐฐ๋‹˜๋“ค์ด ๋ชจ์—ฌ์žˆ๋Š” ์Šฌ๋ž™์—์„œ ์กฐ์–ธ์„ ๊ตฌํ•ด๋ดค๋‹ค.

๊ทธ๋ ‡๋‹ค. ์‹ค์ œ๋กœ SQL explain์— ๋”ฐ๋ฅด๋ฉด reviewtag scanning rows๋Š” 17,902,236๋‚˜ ๋œ๋‹ค. ์ฟผ๋ฆฌ๋กœ ์ง์ž‘ํ•˜๊ฑด๋ฐ, exists๋ฅผ ํ†ตํ•ด ๋ฆฌ๋ทฐ์— ๋‹ฌ๋ฆฐ ๋ฆฌ๋ทฐํƒœ๊ทธ ์ค‘์—์„œ ROOM property๊ฐ€ ์žˆ๋Š” ๋ฆฌ๋ทฐ๋ฅผ 10๊ฐœ ์ฐพ์„ ๋•Œ๊นŒ์ง€ ์„ ํ˜•์ ์œผ๋กœ ํ’€ ์Šค์บ”์„ ํ•˜๊ณ  ์žˆ๋Š” ๊ฒƒ์ด๋ผ๊ณ  ์ง์ž‘์ด ๋œ๋‹ค. ๋ฉ”์ธ ์ฟผ๋ฆฌ์˜ ์Šค์บ” ๋ฒ”์œ„๋ฅผ ์ค„์—ฌ์•ผํ•˜๋Š”๋ฐ 1800๋งŒ๊ฐœ๋ฅผ ์Šค์บ”ํ•˜๊ณ  ์žˆ์œผ๋ฉด ์˜ค๋ž˜ ๊ฑธ๋ฆด ์ˆ˜ ๋ฐ–์— ์—†์„ ๊ฒƒ์ด๋‹ค.

์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” SQL์„ ์ตœ๋Œ€ํ•œ ํšจ์œจ์ ์œผ๋กœ ๋™์ž‘ํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์–‘ํ•œ ์ตœ์ ํ™”๋ฅผ ์ž๋™์œผ๋กœ ์ ์šฉํ•œ๋‹ค๊ณ  ์•Œ๊ณ  ์žˆ๋Š”๋ฐ, ์ •๋ง ๋ฐฉ๋ฒ•์ด ์—†๋Š” ๊ฒƒ์ผ๊นŒ? ์ฟผ๋ฆฌ์˜ ์‹คํ–‰ ๊ณ„ํš์„ ๋” ์ž์„ธํžˆ ์•Œ์•˜๋Š”๋ฐ, ์ˆœ์„œ๊ฐ€ ์˜ˆ์ƒ๊ณผ๋Š” ์กฐ๊ธˆ ๋‹ฌ๋ž๋‹ค. ํ•˜๋‚˜ํ•˜๋‚˜ ์กฐ์‚ฌํ•ด๋ณด์ž.

 

explain analysis graph

1. property filter

์šฐ์„  ์ „์ฒด ๋ฆฌ๋ทฐํƒœ๊ทธ ์ค‘์—์„œ property ์ปฌ๋Ÿผ์ด ROOM์ธ ๊ฒƒ๋งŒ ํ•„ํ„ฐ๋งํ•œ๋‹ค.

2. deduplication

๋ฆฌ๋ทฐ์—๋Š” 3๊ฐœ์˜ ๋ฆฌ๋ทฐํƒœ๊ทธ๊ฐ€ ์—ฐ๊ด€๋˜๋ฏ€๋กœ, ์ค‘๋ณต๋˜๋Š” ํƒœ๊ทธ๋“ค์„ ์ œ๊ฑฐํ•˜๋Š” deduplication ๊ณผ์ •์„ ๊ฑฐ์นœ๋‹ค. ๋‹ค์ด์–ด๊ทธ๋žจ์— ์ ํ˜€์žˆ๋Š” rows๋Š” ํ†ต๊ณ„๋ฅผ ํ†ตํ•œ ์ถ”์ฒญ์น˜์ด๊ณ , ์‹ค์ œ๋กœ propert๊ฐ€ ROOM ํƒœ๊ทธ๋งŒ ๊ณจ๋ž๋‹ค๋ฉด 800,000 ์ •๋„ ๋œ๋‹ค.

Materialize๋Š” MySQL์ด ์„œ๋ธŒ์ฟผ๋ฆฌ๋‚˜ ๋ณต์žกํ•œ ์กฐ์ธ์„ ์ฒ˜๋ฆฌํ•  ๋•Œ, ์ค‘๊ฐ„ ๊ฒฐ๊ณผ๋ฅผ ์ž„์‹œ ํ…Œ์ด๋ธ”์— ์ €์žฅํ•˜๋Š” ๊ณผ์ •์„ ๋งํ•œ๋‹ค. ๋”์šฑ ํšจ์œจ์ ์œผ๋กœ ์ฟผ๋ฆฌ๋ฅผ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ๊ณ  ๋ฐ˜๋ณต์ ์ธ ๊ณ„์‚ฐ์„ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.

3. FK๋ฅผ ํ†ตํ•œ Nested loop inner join

ํ•„ํ„ฐ๋ง๊ณผ ์ค‘๋ณต์ด ์ œ๊ฑฐ๋œ ๋ฆฌ๋ทฐ ํƒœ๊ทธ๋ฅผ review์™€ join์„ ํ†ตํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ์กฐํšŒํ•œ๋‹ค.

4. LIMIT, SELECT

๋ฌผ๋ก  MySQL ๋‚ด๋ถ€์ ์œผ๋กœ ์ž˜ ์ตœ์ ํ™”๋˜์–ด 800000๊ฐœ ์ „๋ถ€ Joinํ•˜์ง€๋Š” ์•Š๊ณ  ํ•„์š”ํ•œ 10๊ฐœ๋งŒ Join ํ›„ ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜ํ™˜ํ–ˆ์„ ๊ฒƒ์ด๋‹ค.

 


์ตœ์ ํ™”

์ด์ƒํ•˜๋‹ค.. ๋‚˜๋Š” ๋ถ„๋ช… reviewtag4_.review_id=review1_.review_id and reviewtag4_.property="ROOM" ๋กœ where ์ ˆ์„ ์ž‘์„ฑํ–ˆ์œผ๋ฏ€๋กœ ์ธ๋ฑ์Šค๋„ ์ด ์ˆœ์„œ๋กœ ์ ์šฉํ•˜์˜€๋‹ค. ํ•˜์ง€๋งŒ ์‹ค์ œ ์‹คํ–‰ ์ˆœ์„œ๋ฅผ ๋ฐ˜๋Œ€์˜€๋‹ค. ์•„๋งˆ๋„, 1:N ๊ด€๊ณ„์ธ ๋ฆฌ๋ทฐ์™€ ๋ฆฌ๋ทฐํƒœ๊ทธ ๊ด€๊ณ„ ์ƒ, ๋ฆฌ๋ทฐ์™€ ๊ด€๊ณ„๋œ ๋ฆฌ๋ทฐํƒœ๊ทธ๋ฅผ ๋ถˆ๋Ÿฌ์™€์„œ property๋ฅผ ํ™•์ธํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค property๋ฅผ ์ถฉ์กฑํ•˜๋Š” ๋ฆฌ๋ทฐํƒœ๊ทธ์˜ ์ค‘๋ณต์„ ์ œ๊ฑฐํ•˜๊ณ  ๋ฆฌ๋ทฐ์™€ ์—ฐ๊ด€์‹œํ‚ค๋Š”๊ฒŒ ํšจ์œจ์ ์ด๋ผ๊ณ  ํŒ๋‹จํ•œ ๊ฒƒ ๊ฐ™๋‹ค.

๊ทธ๋ ‡๋‹ค๋ฉด ์ด ์ˆœ์„œ์— ๋งž๊ฒŒ ์ธ๋ฑ์Šค๋ฅผ ์ ์šฉํ•ด๋ณด๋ฉด ์–ด๋–จ๊นŒ? ๊ฒฐ๊ณผ๋Š” ์„ฑ๊ณต์ ์ด๋‹ค!

  • reviewtag index (property, review_id)

 

explain analysis graph

property filter์™€ deduplication์ด ํ•œ ๋‹จ๊ณ„๋กœ ํ•ฉ์ณ์กŒ๋‹ค. ์ด์ „์—๋Š” ์ˆœ์„œ๊ฐ€ ๋งž์ง€ ์•Š์•„ property filter ์ดํ›„ review_id๋ฅผ ํ†ตํ•œ deduplicationdl ๋ถˆ๊ฐ€๋Šฅํ–ˆ์ง€๋งŒ, ์ด์ œ๋Š” covering index๋กœ ํ•œ ๋ฒˆ์— ๊ฐ€๋Šฅํ•˜์—ฌ ๋‘ ๋‹จ๊ณ„๊ฐ€ ํ†ตํ•ฉ๋œ ๊ฒƒ ๊ฐ™๋‹ค. ๋‹จ์ ์œผ๋กœ ์ˆ˜์น˜๋งŒ ๋น„๊ตํ•˜์ž๋ฉด 100๋งŒ ๊ฑด ์ด์ƒ์˜ ์Šค์บ”์ด ์ƒ๋žต๋œ ๊ฒƒ์ด๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ์ฟผ๋ฆฌ ์‹คํ–‰์‹œ๊ฐ„๋„ 7.5์ดˆ์—์„œ 1.5์ดˆ๋กœ ๋‹จ์ถ•๋˜์—ˆ๋‹ค.

explain analysis

-> Limit: 10 row(s)  (cost=2.05e+6 rows=10) (actual time=1552..1552 rows=10 loops=1)
    -> Nested loop inner join  (cost=2.05e+6 rows=14.5e+6) (actual time=1552..1552 rows=10 loops=1)
        -> Index scan on review1_ using review_created_at_index  (cost=0.0707 rows=10) (actual time=0.0462..0.0762 rows=14 loops=1)
        -> Single-row index lookup on <subquery2> using <auto_distinct_key> (review_id=review1_.review_id)  (cost=466650..466650 rows=1) (actual time=111..111 rows=0.714 loops=14)
            -> Materialize with deduplication  (cost=466649..466649 rows=1.45e+6) (actual time=1552..1552 rows=800000 loops=1)
                -> Covering index lookup on reviewtag4_ using review_tag_property_review_id_index (property='ROOM')  (cost=321768 rows=1.45e+6) (actual time=0.0266..323 rows=800000 loops=1)

 


์‹ค์ œ ์Šค์ผ€์ผ ์ ์šฉ

์•„์ง ๋น ๋“œ๋ฆฐ ๊ฒƒ์ด ์žˆ๋‹ค. ์„ฑ๋Šฅ ์ €ํ•˜๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ๋ฆฌ๋ทฐ ํƒœ๊ทธ ๋ถ€๋ถ„์ด ์ถ”๊ฐ€๋œ ์ฟผ๋ฆฌ๋งŒ ๋‹ค๋ค˜๋Š”๋ฐ, ์ด์ œ๋Š” ์ƒํ’ˆ์ด ์†Œ์†๋œ ์—ฌํ–‰์‚ฌ์™€ ๋ฆฌ๋ทฐ๊ฐ€ ์†Œ์†๋œ ์ƒํ’ˆ์„ ๋น„๊ตํ•˜๋Š” ์กฐ๊ฑด์ ˆ๋„ ์ถ”๊ฐ€๋˜์–ด์•ผ ํ•œ๋‹ค. ์ด์— ๋งž๊ฒŒ ์ธ๋ฑ์Šค๋„ ์ •๋ˆํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

์‹คํ–‰์‹œ๊ฐ„๋„ 1500 ms -> 120 ms๋กœ ๋Œ€ํญ ๊ฐ์†Œ๋˜์—ˆ๋Š”๋ฐ, ์ด์ „๊นŒ์ง€๋Š” 2400๋งŒ ๊ฐœ(๋ฆฌ๋ทฐ 600๋งŒ + ๋ฆฌ๋ทฐํƒœ๊ทธ 1200๋งŒ)๋ฅผ ๋Œ€์ƒ์œผ๋กœ ์กฐํšŒํ–ˆ๋‹ค๋ฉด ํ˜„์žฌ๋Š” 2๋งŒ ๊ฐœ(๋ฆฌ๋ทฐ 5000 + ๋ฆฌ๋ทฐํƒœ๊ทธ 10000)๋ฅผ ๋Œ€์ƒ์œผ๋กœ ์กฐํšŒํ•˜๊ณ  ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ ์„ฑ๋Šฅ ์ €ํ•˜๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ๋ฆฌ๋ทฐ์™€ ๋ฆฌ๋ทฐํƒœ๊ทธ์˜ ๊ฐœ์ˆ˜๊ฐ€ ์ค„์–ด๋“ค์–ด scanning rows ์ž์ฒด๊ฐ€ ์ค„์–ด๋“ค์–ด์„œ์ด๋‹ค.

 

 


๋งˆ๋ฌด๋ฆฌ

๊ธด ์‹œ๊ฐ„ ๋™์•ˆ ์ˆ˜ ๋งŽ์€ ์‹œ๋„์™€ ์ฐฉ์˜ค์˜ ๋์— 2400๋งŒ ๊ฐœ์˜ ๋ฐ์ดํ„ฐ(๋ฆฌ๋ทฐ 600๋งŒ + ๋ฆฌ๋ทฐํƒœ๊ทธ 1200๋งŒ)๋ฅผ ๋Œ€์ƒ์œผ๋กœ 94%์˜ ์„ฑ๋Šฅ ๊ฐœ์„ ์„ ์ด๋ค„๋‚ด ๋งค์šฐ ๋งŒ์กฑ์Šค๋Ÿฌ์šด ๊ฒฐ๊ณผ๋กœ ๋งˆ๋ฌด๋ฆฌ๋˜์—ˆ๋‹ค. ๊ฐ€์žฅ ํฌ๊ฒŒ ๋ฐฐ์šด ๊ฒƒ์€ ์ฟผ๋ฆฌ ์‹คํ–‰ ๊ณ„ํš ๋ถ„์„๋ฒ•๊ณผ ์ธ๋ฑ์Šค ์ƒ์„ฑ ์ „๋žต์ธ ๊ฒƒ ๊ฐ™๋‹ค. ์ฒ˜์Œ ์ธ๋ฑ์Šค๋ฅผ ๋ฐฐ์šฐ๋ฉด์„œ ๋ถˆ์™„์ „ํ•˜๊ฒŒ ๋ฐฐ์šด ์ด๋ก ๋งŒ์„ ๋ฐ”ํƒ•์œผ๋กœ ์ตœ์ ํ™”ํ•˜๋‹ค๋ณด๋‹ˆ ๋งŽ์€ ์‹œํ–‰์ฐฉ์˜ค๋ฅผ ๊ฒช์—ˆ์ง€๋งŒ, ๊ทธ ๊ณผ์ •์—์„œ ์‹ค์ œ ๋™์ž‘ํ•˜๋Š” DB ์—”์ง„๊ณผ ์˜ตํ‹ฐ๋งˆ์ด์ €์˜ ๋™์ž‘์— ๋งž๊ฒŒ ์ธ๋ฑ์Šค๋ฅผ ํŠœ๋‹ํ•ด๋‚˜๊ฐ€๋Š” ๋ฒ•์„ ๋ฐฐ์šธ ์ˆ˜ ์žˆ์—ˆ๋‹ค. ํŠนํžˆ, "์ธ๋ฑ์Šค์˜ ์ˆœ์„œ๊ฐ€ ๋งž์ง€ ์•Š์œผ๋ฉด ์ œ๋Œ€๋กœ ์ธ๋ฑ์Šค๋ฅผ ์ œ๋Œ€๋กœ ํƒ€์ง€ ์•Š๋Š” ๊ฒƒ"๊ณผ "์ธ๋ฑ์Šค๋ฅผ ์ œ๋Œ€๋กœ ํƒ€์ง€ ์•Š๋Š” ๊ฒƒ"์˜ ์ง„์ •ํ•œ ์˜๋ฏธ๋ฅผ ๊ฒฝํ—˜ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๊ฐ€๋ณ๊ฒŒ ์ด๋ก ์„ ๋ฐฐ์šฐ๊ณ  ์ธ๋ฑ์Šค๋ฅผ ํ†ตํ•œ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ํŠœ๋‹์„ ์ง์ ‘ ๋ถ€๋”ชํ˜€๋ณด๋ฉฐ ๋งŽ์€ ๊ฒƒ์„ ๋ฐฐ์› ์œผ๋‹ˆ, ์ด ๊ฒฝํ—˜์„ ๋ฐ”ํƒ•์œผ๋กœ ๋” ๊นŠ์€ ์ด๋ก ๋„ ๋ฐฐ์›Œ๋ณผ๊นŒ ํ•œ๋‹ค.

 

์„ฑ๋Šฅ ๊ฐœ์„  ๊ฒฐ๊ณผ

์š”์ฒญ๊ธฐ์กด ์กฐํšŒ ์‹œ๊ฐ„์ „์ฒด ๋ฆฌ๋ทฐ ๋Œ€์ƒ ์กฐํšŒ ์‹œ๊ฐ„์ƒํ’ˆ ๋ฆฌ๋ทฐ ๋Œ€์ƒ ์กฐํšŒ ์‹œ๊ฐ„

๋ฆฌ๋ทฐ ๋‹จ์ผ ์กฐํšŒ 30 ms 33 ms 32 ms
๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 37 ms 35 ms 113 ms
์นดํ…Œ๊ณ ๋ฆฌ ๋ณ„ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 7504 ms 1555 ms 132 ms
์นดํ…Œ๊ณ ๋ฆฌ+ํ‚ค์›Œ๋“œ ๋ณ„ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 4550 ms 10 ms 11 ms
๋ณ„์ ์ˆœ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 41 ms 36 ms 89 ms
๊ธ์ •์ ์ธ์ˆœ ๋ฆฌ๋ทฐ ๋ชฉ๋ก ์กฐํšŒ (LIMIT 10) 35 ms 33 ms 76 ms
์ƒํ’ˆ์˜ ๋ฆฌ๋ทฐ ํ†ต๊ณ„ ์กฐํšŒ 71 ms 70 ms 40 ms
์ƒํ’ˆ์˜ ๋ฆฌ๋ทฐํƒœ๊ทธ ํ†ต๊ณ„ ์กฐํšŒ 107 ms 188 ms 176 ms
๊ธฐ์กด ๋Œ€๋น„ ๊ฐœ์„ ์œจ   84 % 94 %

 


๋‹ค์–‘ํ•œ ์‹œํ–‰์ฐฉ์˜ค๋“ค

1. Join์œผ๋กœ Exists ์„œ๋ธŒ ์ฟผ๋ฆฌ ๋Œ€์ฒด

QueryDSL any() ๊ฐ€ ์ž๋™ ์ƒ์„ฑํ•œ where(exist(select 1 ~))) ๋ฅผ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ์ง€๋งŒ, reviewtag scan rows๊ฐ€ ๊ต‰์žฅํžˆ ํฐ ๊ฒƒ์œผ๋กœ ๋ณด์•„ ์ธ๋ฑ์Šค๋ฅผ ์ œ๋Œ€๋กœ ํƒ€์ง€ ์•Š๋Š” ๊ฒƒ์œผ๋กœ ๋ณด์ธ๋‹ค. ์„œ๋ธŒ์ฟผ๋ฆฌ ๊ตฌ์กฐ์˜ ํ•œ๊ณ„์ผ ์ˆ˜ ์žˆ๋‹ค๊ณ  ์ƒ๊ฐ๋˜์–ด, ์„œ๋ธŒ์ฟผ๋ฆฌ ๋Œ€์‹  join์„ ์ด์šฉํ•ด๋ณด์ž.

select
        distinct review0_.review_id as review_i1_8_,
        review0_.created_at as created_2_8_,
        review0_.updated_at as updated_3_8_,
        review0_.content as content4_8_,
        review0_.negative_tags_count as negative5_8_,
        review0_.polarity as polarity6_8_,
        review0_.positive_tags_count as positive7_8_,
        review0_.rating as rating8_8_,
        review0_.reservation_id as reserva10_8_,
        review0_.title as title9_8_
from
    review review0_
inner join
    review_tag reviewtags1_
        on review0_.review_id=reviewtags1_.review_id
where
    reviewtags1_.property="ROOM"
order by
    review0_.created_at desc limit 10;

14์ดˆ ์ •๋„๋กœ ์˜คํžˆ๋ ค ์„ฑ๋Šฅ์ด ์ €ํ•˜๋˜์—ˆ๋‹ค. ํŠนํžˆ ๋ฆฌ๋ทฐํƒœ๊ทธ ๋น„๊ต๊ฐ€ ๋“ค์–ด๊ฐ€์ง€ ์•Š์•„ ๊ต‰์žฅํžˆ ๋น ๋ฅด๋˜ ๊ฐ„๋‹จํ•œ ์กฐํšŒ๋“ค์ด 40์ดˆ ๋„˜๊ฒŒ ๊ฑธ๋ฆฌ๋Š” ๋“ฑ์˜ ๋ชจ์Šต์„ ๋ณด์˜€๋‹ค.

 

2. ์ธ๋ฑ์Šค ์ฃผ์˜

๊ธฐ์กด์˜ Exists ๋ฐฉ์‹์˜ where ์ ˆ ๋‚ด๋ถ€์—์„œ reviewtag์˜ review_id์™€ property๋ฅผ ์‚ฌ์šฉํ•˜๋ฏ€๋กœ ์ธ๋ฑ์Šค๋„ reviewtag index(review_id, property, keyword)๋ฅผ ์“ฐ๊ณ  ์žˆ์—ˆ๋‹ค. ํ•˜์ง€๋งŒ Join ๋ฐฉ์‹ ์ฟผ๋ฆฌ์˜ where์ ˆ์—์„œ ๋”์ด์ƒ review_id๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š์Œ์—๋„ ์ธ๋ฑ์Šค๋Š” ์ •์ƒ์ ์œผ๋กœ ํƒ€์ง€๋Š”๋ฐ, ์ฟผ๋ฆฌ ์‹คํ–‰์‹œ๊ฐ„์€ ๋งค์šฐ ๊ธธ์–ด์กŒ๋‹ค.

ํ•˜์ง€๋งŒ ์ธ๋ฑ์Šค๋ฅผ reviewtag index(property, keyword)๋กœ ์ •์ƒ์ ์œผ๋กœ ๋ณ€๊ฒฝํ•˜๋ฉด, ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ๋ณ€๊ฒฝ๋œ ์ธ๋ฑ์Šค๋กœ ํƒ€์ง€๋ฉฐ ์‹คํ–‰์‹œ๊ฐ„๋„ ๊ต‰์žฅํžˆ ์งง์•„์ง„๋‹ค. ์‹คํ–‰๊ณ„ํš์œผ๋กœ ๋น„๊ตํ•˜๋ฉด reviewtag์˜ scan ๋‹จ๊ณ„์—์„œ actual rows: 18e+6 -> 800,000๋กœ ๋Œ€ํญ ๊ฐ์†Œํ–ˆ์œผ๋ฉฐ Filter ๋‹จ๊ณ„๊ฐ€ ์ƒ๋žต๋๋‹ค. ๊ธฐ์กด์˜ ์ธ๋ฑ์Šค๋Š” review_id ์ฒ˜์Œ์— ์žˆ๋˜ ํƒ“์— ํ’€์Šค์บ”์ฒ˜๋Ÿผ ๋™์ž‘ํ•˜์˜€๊ณ , ์ดํ›„ ์ธ๋ฑ์Šค๋Š” ์ •์ƒ์ ์œผ๋กœ ์ปค๋ฒ„๋ง ์ธ๋ฑ์Šค๋กœ ๋™์ž‘ํ•˜์˜€๋‹ค.

์•„๋งˆ, ์˜ตํ‹ฐ๋งˆ์ด์ €๊ฐ€ ๋น„ํšจ์œจ์ ์ธ ์ธ๋ฑ์Šค VS ํ’€์Šค์บ”์„ ๋น„๊ตํ–ˆ์„ ๋•Œ, ๊ทธ๋ž˜๋„ ๋น„ํšจ์œจ์ ์ธ ์ธ๋ฑ์Šค๊ฐ€ ๋” ํšจ์œจ์ ์ด๋ผ ์ธ๋ฑ์Šค๋ฅผ ํƒ„ ๊ฒƒ ๊ฐ™๋‹ค. ์ธ๋ฑ์Šค์˜ ์‹œ์ž‘์ ์ด ๋‹ค๋ฅด๋ฉด ์ธ๋ฑ์Šค๊ฐ€ ํƒ€์ง€์ง€ ์•Š๋Š”๋‹ค๋Š” ์ด๋ก ์ ์ธ ๋ฐฐ์›€์—๋งŒ ์˜์กดํ•ด์„œ key ๋ถ€๋ถ„๋งŒ ๋น„๊ตํ–ˆ๋Š”๋ฐ, ๋”์šฑ ๋ฉด๋ฐ€ํžˆ ์‚ดํŽด๋ณด๋ฉฐ ์ฃผ์˜ํ•ด์•ผํ•  ๊ฒƒ ๊ฐ™๋‹ค.

 

3. Distinct VS Group by

Review 1 : Reviewtag N ๊ด€๊ณ„์ด๋ฏ€๋กœ, ์ค‘๋ณต๋˜๋Š” ๋ฆฌ๋ทฐ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด deduplication ๊ณผ์ •์ด ํ•„์š”ํ•˜๋‹ค. ๋ฌธ๋ฒ•์ ์œผ๋กœ ๊ฐ€์žฅ ๋ชฉ์ ์—๋„ ๋งž๊ณ  ์ผ๋ฐ˜์ ์œผ๋กœ ๋” ๋น ๋ฅธ distinct๋ฅผ ์“ฐ๊ณ  ์žˆ์—ˆ๋Š”๋ฐ, Join ๋ฐฉ์‹์˜ ์ฟผ๋ฆฌ์—์„œ๋Š” ์ง์ ‘ ์‹œ๋„ํ•ด๋ณด๋‹ˆ ๋‹ฌ๋ž๋‹ค.

distinct ์‹คํ–‰๊ณ„ํš

-> Temporary table with deduplication  (cost=2.4e+6..2.4e+6 rows=1.62e+6) (actual time=13088..13088 rows=800000 loops=1)

group by ์‹คํ–‰๊ณ„ํš

-> Temporary table with deduplication  (cost=2.4e+6..2.4e+6 rows=1.62e+6) (actual time=8587..8587 rows=800000 loops=1)

๊ด€๋ จ๋œ ์ž๋ฃŒ๋ฅผ ์•„๋ฌด๋ฆฌ ์กฐ์‚ฌํ•ด๋„ ์ผ๋ฐ˜์ ์œผ๋กœ๋Š” distinct๊ฐ€ ๋น ๋ฅด๋‹ค๋Š” ์ž๋ฃŒ ๋ฐ–์— ์ฐพ์„ ์ˆ˜ ์—†์—ˆ๋‹ค. ์‹คํ–‰ ๊ณ„ํš ์ƒ cost๋Š” ๊ฐ™์Œ์—๋„ actual time์ด ์ฐจ์ด๊ฐ€ ๋‚œ๋‹ค. ์ผ๋‹จ์€ ๋” ๋น ๋ฅธ group by๋ฅผ ์ฑ„์šฉํ•˜์˜€๋‹ค. ์‹คํ–‰ ์‹œ๊ฐ„์€ 8์ดˆ ์ •๋„์˜€๋‹ค.