ȸ¿ø°¡ÀԡžÆÀ̵ð/ºñ¹øã±â
ȨÀ¸·Î

ÆÄÀ̽㿡¼­ À¯´ÏÄÚµå ½ºÆ®¸² ´Ù·ç±â
8³â Àü
ÆÄÀ̽㿡¼­ À¯´ÏÄڵ带 ´Ù·ê ¶§´Â ÀϹÝÀûÀ¸·Î str.decode()¿Í unicode.encode() ¸Þ¼­µå¸¦ »ç¿ëÇÏ¿© unicode ŸÀÔ°ú str ŸÀÔÀ» »óÈ£ º¯È¯ÇÑ´Ù.

¾Æ·¡ ¿¹½Ã¿¡¼­´Â 'utf-16'À¸·Î ÀÛ¼ºµÈ ÆÄÀÏÀ» ¿­¾î, ¼öÁ÷ ÅÇ(vertical tab) ÄÚµåÆ÷ÀÎÆ®¸¦ Áö¿î ´ÙÀ½, 'utf-8'·Î ÀúÀåÇÑ´Ù. (±úÁø XMLÀ» ´Ù·ê ¶§ ÀÌ ¹æ½ÄÀÌ ¸Å¿ì Áß¿äÇÏ´Ù.)

# ÆÄÀÏ ³»¿ëÀ» Àд´Ù
with open("input.txt", "rb") as input:  
    data = input.read()

# ¹ÙÀ̳ʸ® µ¥ÀÌÅ͸¦ utf-16À¸·Î µðÄÚµùÇÑ´Ù
data = data.decode("utf-16")

# ¼öÁ÷ ÅÇÀ» »èÁ¦ÇÑ´Ù
data = data.replace(u"\u000B", u"")

# À¯´ÏÄÚµå µ¥ÀÌÅ͸¦ utf-8·Î ÀÎÄÚµùÇÑ´Ù
data = data.encode("utf-8")

# µ¥ÀÌÅ͸¦ utf-8·Î ÀúÀåÇÑ´Ù
with open("output.txt", "wb") as output:  
    output.write(data)
¾öû³ª°Ô Å« ÆÄÀÏÀ» ´Ù·ê ¶§°¡ ¾Æ´Ï¶ó¸é ÀÌ Á¤µµ·Îµµ ÃæºÐÇÏ´Ù. ÇÏÁö¸¸ Å« ÆÄÀÏÀ» ´Ù·ê ¶© ¸ðµç µ¥ÀÌÅÍ°¡ ¸Þ¸ð¸®¿¡ ¿Ã¶ó°£´Ù´Â »ç½ÇÀÌ ¹®Á¦°¡ µÈ´Ù.

½ºÆ®¸®¹Ö ÀÎÄÚ´õ/µðÄÚ´õ »ç¿ëÇϱâ
ÆÄÀ̽㠱⺻ ¶óÀ̺귯¸®¿¡´Â codecs ¸ðµâÀÌ Æ÷ÇԵǾî ÀÖ´Ù. ÀÌ ¸ðµâÀ» »ç¿ëÇϸé ÆÄÀÏÀ» Á¶±Ý¾¿ ÀÐÀ» ¼ö ÀÖ°í, ¸Þ¸ð¸®¿¡µµ ¾à°£ÀÇ À¯´ÏÄÚµå µ¥ÀÌÅ͸¸ ¿Ã¶ó°¡°Ô µÈ´Ù.

codecs.open() ÇïÆÛ ¸Þ¼­µå¸¦ »ç¿ëÇÏ¿© À§ÀÇ ¿¹½Ã¸¦ ÃÖ¼ÒÇѸ¸ °íÃĺ¸ÀÚ.

import codecs

# ÀÔ·Â ½ºÆ®¸²°ú Ãâ·Â ½ºÆ®¸²À» ¿¬´Ù
input = codecs.open("input.txt", "rb", encoding="utf-16")  
output = codecs.open("output.txt", "wb", encoding="utf-8")

# À¯´ÏÄÚµå µ¥ÀÌÅÍ Á¶°¢µéÀ» ½ºÆ®¸®¹ÖÇÑ´Ù
with input, output:  
    while True:
        # µ¥ÀÌÅÍ Á¶°¢À» Àаí
        chunk = input.read(4096)
        if not chunk:
            break
        # ¼öÁ÷ ÅÇÀ» »èÁ¦ÇÑ´Ù
        chunk = chunk.replace(u"\u000B", u"")
        # µ¥ÀÌÅÍ Á¶°¢À» ¾´´Ù
        output.write(chunk)
ÆÄÀÏÀº ²ûÂïÇØ! ÀÌÅÍ·¹ÀÌÅÍ »ç¿ëÇϱâ
ÆÄÀÏÀº ´Ù·ç±â°¡ Á» Áö·çÇÏ´Ù. º¹ÀâÇÑ Ã³¸® °úÁ¤¿¡´Â À¯´ÏÄÚµå µ¥ÀÌÅÍÀÇ ÀÌÅÍ·¹ÀÌÅ͸¦ ´Ù·ç´Â ÆíÀÌ ±ò²ûÇÒ °ÍÀÌ´Ù.

¾Æ·¡´Â iterdecode()¸¦ »ç¿ëÇÏ¿©, ÆÄÀÏÀ» À¯´ÏÄÚµå µ¥ÀÌÅÍ Á¶°¢ÀÇ ÀÌÅÍ·¹ÀÌÅÍ·Î Àд ȿ°úÀûÀÎ ¹æ¹ýÀÌ´Ù.

from functools import partial  
from codecs import iterdecode

# ƯÁ¤ pathÀÇ ÆÄÀÏÀ» À¯´ÏÄÚµå Á¶°¢ÀÇ ÀÌÅÍ·¹ÀÌÅÍ·Î ¸®ÅÏÇÑ´Ù
def iter_unicode_chunks(path, encoding):  
    # ÀÐÀ» ÆÄÀÏÀ» ¿¬´Ù
    with open(path, "rb") as input:
        # ¹ÙÀ̳ʸ® ÆÄÀÏÀ» ¹ÙÀ̳ʸ® Á¶°¢À¸·Î º¯È¯ÇÑ´Ù
        binary_chunks = iter(partial(input.read, 1), "")
        # ¹ÙÀ̳ʸ® Á¶°¢À» À¯´ÏÄÚµå Á¶°¢À¸·Î º¯È¯ÇÑ´Ù
        for unicode_chunk in iterdecode(binary_chunks, encoding):
            yield unicode_chunk
ÀÌÁ¦ iterencode() ¸Þ¼­µå¸¦ »ç¿ëÇÏ¿©, À¯´ÏÄÚµå Á¶°¢ÀÇ ÀÌÅÍ·¹ÀÌÅ͸¦ ÆÄÀÏ¿¡ ½áº¸ÀÚ.

from codecs import iterencode

# À¯´ÏÄÚµå Á¶°¢ÀÇ ÀÌÅÍ·¹ÀÌÅ͸¦ ƯÁ¤ pathÀÇ ÆÄÀÏ¿¡ ¾´´Ù
def write_unicode_chunks(path, unicode_chunks, encoding):  
    # ¾µ ÆÄÀÏÀ» ¿¬´Ù
    with open(path, "wb") as output:
        # À¯´ÏÄÚµå Á¶°¢À» ¹ÙÀ̳ʸ®·Î º¯È¯ÇÑ´Ù
        for binary_chunk in iterencode(unicode_chunks, encoding):
            output.write(binary_chunk)
ÀÌ µÎ ÇÔ¼ö¿Í ÇÔ²² À¯´ÏÄÚµå µ¥ÀÌÅÍÀÇ ½ºÆ®¸²¿¡¼­ ¼öÁ÷ ÅÇÀ» ¾ø¾Ö´Â ÀÏÀÌ ¸¶¹ý °°ÀÌ ³¡³­´Ù(just becomes a case of plumbing everything together).

# ÆÄÀÏÀ» À¯´ÏÄÚµå Á¶°¢ ÇüÅ·ΠÀд´Ù
unicode_chunks = iter_unicode_chunks("input.txt", encoding="utf-16")

# À¯´ÏÄÚµå Á¶°¢À» ¼öÁ¤ÇÑ´Ù
unicode_chunks = (  
    chunk.replace(u"\u000B", u"")
    for chunk
    in unicode_chunks
)

# À¯´ÏÄÚµå Á¶°¢À» ÆÄÀÏ¿¡ ÀúÀåÇÑ´Ù
write_unicode_chunks("output.txt", unicode_chunks, encoding="utf-8")  
°ÅâÇÏ°Ô codecs ¸ðµâÀ» »ç¿ëÇØ¾ß ÇÒ±î?
¾óÇÍ ±×³É, str.decode()¿Í unicode.encode() ¸Þ¼­µå¸¦ »ç¿ëÇÏ¿© Å« file °´Ã¼¸¦ ¹ÙÀ̳ʸ® Á¶°¢À¸·Î Àаí, ÀÎÄÚµùÇÏ°í µðÄÚµùÇÏ´Â ÆíÀÌ °£´ÜÇÏ´Ù°í »ý°¢ÇÒ ¼öµµ ÀÖ°Ú´Ù.

# ³ª»Û ¿¹½Ã. ÀÌ·¸°Ô ÇÏÁö ¸¶½Ã¿À!

# ÀÔ·Â ½ºÆ®¸²°ú Ãâ·Â ½ºÆ®¸²À» ¿¬´Ù
with open("input.txt", "rb") as input, open("output.txt", "wb") as output:  
    # ¹ÙÀ̳ʸ® µ¥ÀÌÅÍ Á¶°¢µéÀ» ¼øȸÇÑ´Ù
    while True:
        # µ¥ÀÌÅÍ Á¶°¢À» Àд´Ù
        chunk = input.read(4096)
        if not chunk:
            break
        # À§Çè: ¹ÙÀ̳ʸ® µ¥ÀÌÅ͸¦ utf-16À¸·Î µðÄÚµùÇÑ´Ù
        chunk = chunk.decode("utf-16")
        # ¼öÁ÷ ÅÇÀ» »èÁ¦ÇÑ´Ù
        chunk = chunk.replace(u"\u000B", u"")
        # À¯´ÏÄÚµå µ¥ÀÌÅ͸¦ utf-8·Î ÀÎÄÚµùÇÑ´Ù
        chunk = chunk.encode("utf-8")
        # µ¥ÀÌÅÍ Á¶°¢À» ¾´´Ù
        output.write(chunk)
ºÒÇàÈ÷µµ ¸î¸î À¯´ÏÄÚµå ÄÚµåÆ÷ÀÎÆ®´Â ¹ÙÀ̳ʸ® µ¥ÀÌÅÍÀÇ ÇÑ ¹ÙÀÌÆ® ÀÌ»óÀ¸·Î ÀÎÄÚµùµÈ´Ù. µû¶ó¼­ ´Ü¼øÈ÷ ÆÄÀÏ¿¡¼­ ¹ÙÀÌÆ® Á¶°¢µéÀ» Àо decode() ¸Þ¼­µå¸¦ Àû¿ëÇÏ¸é ¿¹±âÄ¡ ¾Ê°Ô UnicodeDecodeError°¡ ¹ß»ýÇÒ ¼öµµ ÀÖ´Ù. ÀÌ´Â ¹ÙÀÌÆ® ÇÑ Á¶°¢ÀÌ ¿©·¯ ¹ÙÀÌÆ®ÀÇ ÄÚµåÆ÷ÀÎÆ®·Î ºÐ¸®µÇ¾ú±â ¶§¹®ÀÌ´Ù.

codecs ¸ðµâÀÇ µµ±¸µéÀ» »ç¿ëÇϸé ÀÌ·¯ÇÑ ¿¹±âÄ¡ ¾ÊÀº Ãæµ¹À» ¿¹¹æÇÒ ¼ö ÀÖ´Ù.

ÆÄÀ̽ã 3¿¡¼­´Â?
ÆÄÀ̽ã 3¿¡¼­´Â ÈξÀ ´Ü¼øÇÏ°Ô À¯´ÏÄÚµå ÆÄÀÏÀ» ´Ù·ê ¼ö ÀÖ´Ù. ºôÆ®ÀÎ ¸Þ¼­µåÀÎ open()Àº À¯´ÏÄÚµå µ¥ÀÌÅ͸¦ ¼öÁ¤Çϰųª ÀÎÄÚµùÀ» º¯°æÇÏ´Â µ¥ ÇÊ¿äÇÑ ±â´ÉÀ» Æ÷ÇÔÇÏ°í ÀÖ´Ù.

# ÀÔ·Â ½ºÆ®¸²°ú Ãâ·Â ½ºÆ®¸²À» ¿¬´Ù
input = open("input.txt", "rt", encoding="utf-16")  
output = open("output.txt", "wt", encoding="utf-8")

# À¯´ÏÄÚµå µ¥ÀÌÅÍ Á¶°¢µéÀ» ½ºÆ®¸®¹ÖÇÑ´Ù
with input, output:  
    while True:
        # µ¥ÀÌÅÍ Á¶°¢À» Àаí
        chunk = input.read(4096)
        if not chunk:
            break
        # ¼öÁ÷ ÅÇÀ» »èÁ¦ÇÑ´Ù
        chunk = chunk.replace("\u000B", "")
        # µ¥ÀÌÅÍ Á¶°¢À» ¾´´Ù
        output.write(chunk)
ÆÄÀ̽ã 3ÀÇ ½Ã´ë´Ù! Áñ°Ì°Ô ÄÚµùÇϱæ!
ÃßõÃßõ : 330 Ãßõ ¸ñ·Ï
¹øÈ£ Á¦¸ñ
2,765
ÀÚ¹Ù½ºÅ©¸³Æ® ¿¬½À ¿¹Á¦ - ·£´ýÇÏ°Ô ¿òÁ÷ÀÌ´Â ÅؽºÆ® ¸¸µé±â
2,764
jQuery Mobile¿¡¼­ À¯¿ëÇÑ ÄÚµå 10°¡Áö
2,763
ÀÚ¹Ù½ºÅ©¸³Æ®·Î º¹»çÇϱ⠱â´É ±¸ÇöÇϱâ
2,762
[JqueryMobile] - Jquery Mobile È­¸é °­Á¦ ȸÀü
2,761
jQuery – ½º¸¶Æ®Æù ½ºÅ©¸° ȸÀü °¨Áö ½ºÅ©¸³Æ®
2,760
javascript audioÆÄÀÏ Àç»ý
2,759
javascript º¯¼ö scope shadowing Ŭ·ÎÀú
2,758
javascript ¹Ýº¹¹®
2,757
javascript Á¶°Ç¹®
2,756
javascript ¹è¿­
2,755
javascript ¿¬»êÀÚ¿Í ÇÔ¼ö
2,754
javascript ÀÚ·áÇü
2,753
javascript º¯¼ö
2,752
[jQuery] animate()¿¡¼­ stop()ÀÇ Á߿伺
2,751
javascript audioÆÄÀÏ Àç»ý
2,750
[jQuery.$ajax]mysql + php ¼­¹ö¿¡¼­ json µ¥ÀÌÅÍ ¾ò¾î¿À±â
2,749
Daum¿¡¼­ »ç¿ëÇÏ°í ÀÖ´Â ¸¶¿ì½º ¿À¸¥ÂÊ »ç¿ë ±ÝÁö ½ºÅ©¸³Æ®
2,748
MP3 ¸µÅ©½Ã ½ºÆ®¸®¹Ö µÇÁö ¾Ê°í ´Ù¿î·Îµå µÇµµ·Ï ¸¸µé±â
2,747
jQuery ºñµ¿±â Ajax Åë½ÅÀ» Çغ¸ÀÚ!! - JSON¹æ½Ä
2,746
[jQuery] load ¸¦ ÀÌ¿ëÇÑ ½Ç½Ã°£ ÆäÀÌÁö °¡Á®¿À±â
2,745
[jQuery]ºñµ¿±âÀûÀ¸·Î JSONÆÄÀÏ ·Îµå
2,744
[jQuery]µ¿ÀûÀ¸·Î ¿ø°Ý ÆäÀÌÁö ·Îµå
2,743
[jQuery]¿öÅ͸¶Å© ÅؽºÆ®¹Ú½º ÄÁÆ®·Ñ ±¸Çö
2,742
[jQuery]Æ®¸®ºä ÄÁÆ®·Ñ
2,741
[jQuery]¾Ö´Ï¸ÞÀÌ¼Ç È¿°ú ¸ØÃß±â - stop()
2,740
[jQuery]¿©·¯°¡Áö È¿°ú µ¿½Ã ó¸® - animate()
2,739
[jQuery]½½¶óÀÌµå ¾÷
2,738
[jQuery]½½¶óÀÌµå ¾÷/´Ù¿î - slideToggle()
2,737
[jQuery] scroll down °ú scroll up ÇÒ¶§ À̺¥Æ® Àû¿ë½ÃÅ°±â
2,736
[jQuery]addClass() : ½ºÅ¸ÀÏ Ãß°¡ [removeClass()]
¸ñ·Ï
¹ÂÁ÷Æ®·ÎÆ® ºÎ»ê±¤¿ª½Ã ºÎ»êÁø±¸ °¡¾ßµ¿ ¤Ó °³ÀÎÁ¤º¸Ãë±Þ¹æħ
Copyright ¨Ï musictrot All rights reserved.