Convert binary to ASCII and vice versa

61

30

Using this code to take a string and convert it to binary:

bin(reduce(lambda x, y: 256*x+y, (ord(c) for c in 'hello'), 0))

this outputs:

0b110100001100101011011000110110001101111

Which, if I put it into this site (on the right-hand side), I get my message of hello back. I'm wondering what method it uses. I know I could split the binary string into groups of 8 and then match each group to the corresponding bin(ord(character)) value, or do it some other way. I'm really looking for something simpler.

sbrichards

Posted 2011-09-13T04:34:14.173

Reputation: 1 129

1

related: b2a_bin extension in Cython allows to create binary strings ("01") directly from bytestrings without creating an intermediate Python integer.

– jfs – 2013-11-16T05:59:54.790

1So is your question, "is there a more succinct way to do the inverse of my code than the obvious"? – tripleee – 2011-09-13T05:10:37.393

Answers

127

For ASCII characters in the range [ -~] on Python 2:

>>> import binascii
>>> bin(int(binascii.hexlify('hello'), 16))
'0b110100001100101011011000110110001101111'

In reverse:

>>> n = int('0b110100001100101011011000110110001101111', 2)
>>> binascii.unhexlify('%x' % n)
'hello'

In Python 3.2+:

>>> bin(int.from_bytes('hello'.encode(), 'big'))
'0b110100001100101011011000110110001101111'

In reverse:

>>> n = int('0b110100001100101011011000110110001101111', 2)
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
'hello'

To support all Unicode characters in Python 3:

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int.from_bytes(text.encode(encoding, errors), 'big'))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode(encoding, errors) or '\0'

Here's a single-source Python 2/3 compatible version:

import binascii

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int(binascii.hexlify(text.encode(encoding, errors)), 16))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return int2bytes(n).decode(encoding, errors)

def int2bytes(i):
    hex_string = '%x' % i
    n = len(hex_string)
    return binascii.unhexlify(hex_string.zfill(n + (n & 1)))

Example

>>> text_to_bits('hello')
'0110100001100101011011000110110001101111'
>>> text_from_bits('110100001100101011011000110110001101111') == u'hello'
True
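As a rough check of the Unicode claim, here is a minimal round-trip sketch using the Python 3 functions from above. The sample string is an arbitrary assumption chosen for illustration, not something from the original answer.

```python
def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    # Encode to bytes, read the bytes as one big-endian integer, render as bits.
    bits = bin(int.from_bytes(text.encode(encoding, errors), 'big'))[2:]
    # Left-pad with zeros so the length is a multiple of 8.
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big').decode(encoding, errors) or '\0'

sample = 'na\u00efve \u20ac'  # hypothetical sample containing non-ASCII characters
bits = text_to_bits(sample)
print(len(bits))                        # a multiple of 8
print(text_from_bits(bits) == sample)   # round trip succeeds
```

One limitation worth knowing: because the text passes through an integer, leading NUL characters are not preserved, so text_from_bits(text_to_bits('\0a')) gives back 'a'.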

jfs

Posted 2011-09-13T04:34:14.173

Reputation: 258 659

3@J.F.Sebastian: I tried this method with the current Python version and it does not seem to work: TypeError: 'str' does not support the buffer interface. Would you update your answer? – hamza – 2012-11-13T19:30:23.257

3@hamza: It works on Python 2. On Python 3 you should convert str to bytes first e.g., your_string.encode('ascii', 'strict') – jfs – 2012-11-13T19:32:23.213

1@J.F.Sebasitian: thanks, however when I tried it the other way round, the unhexlify function returned an error message: binascii.Error: Odd-length string. – hamza – 2012-11-14T12:28:06.300

3@hamza: prepend it with '0' if hex-string's length is not even. It happens if the first character in the original string has ascii code less than 16 e.g., '\n' or '\t'. Odd-length never happens for ascii letters [ -~]. – jfs – 2012-11-14T15:59:10.300

2@J.F.Sebastian: i can't thank you enough. – hamza – 2012-11-14T20:01:12.250

1This is exactly what I needed actually, thanks! – sbrichards – 2011-09-13T23:59:36.180

10

Built-in only python

Here is a pure python method for simple strings, left here for posterity.

def string2bits(s=''):
    return [bin(ord(x))[2:].zfill(8) for x in s]

def bits2string(b=None):
    return ''.join([chr(int(x, 2)) for x in b])

s = 'Hello, World!'
b = string2bits(s)
s2 = bits2string(b)

print 'String:'
print s

print '\nList of Bits:'
for x in b:
    print x

print '\nString:'
print s2

String:
Hello, World!

List of Bits:
01001000
01100101
01101100
01101100
01101111
00101100
00100000
01010111
01101111
01110010
01101100
01100100
00100001

String:
Hello, World!
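The code above uses Python 2 print statements. A possible Python 3 port is sketched below; it assumes every character has a code point below 256, so each one fits in 8 bits. Using format(..., '08b') in place of bin(...)[2:].zfill(8) is a substitution made here, not part of the original answer.

```python
def string2bits(s=''):
    # One zero-padded 8-bit string per character (assumes code points < 256).
    return [format(ord(ch), '08b') for ch in s]

def bits2string(bits=()):
    # Parse each 8-bit string as a base-2 integer and map it back to a character.
    return ''.join(chr(int(b, 2)) for b in bits)

s = 'Hello, World!'
b = string2bits(s)
s2 = bits2string(b)

print('String:', s)
print('List of Bits:')
for x in b:
    print(x)
print('String:', s2)
```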

tmthydvnprt

Posted 2011-09-13T04:34:14.173

Reputation: 5 135

8

I'm not sure how you think you can do it other than character-by-character -- it's inherently a character-by-character operation. There is certainly code out there to do this for you, but there is no "simpler" way than doing it character-by-character.

First, you need to strip the 0b prefix and left-pad the string with zeros so its length is divisible by 8, which makes it easy to divide the bitstring into characters:

bitstring = bitstring[2:]
bitstring = -len(bitstring) % 8 * '0' + bitstring

Then you divide the string up into blocks of eight binary digits, convert them to ASCII characters, and join them back into a string:

string_blocks = (bitstring[i:i+8] for i in range(0, len(bitstring), 8))
string = ''.join(chr(int(char, 2)) for char in string_blocks)

If you actually want to treat it as a number, you still have to account for the fact that the leftmost character will be at most seven digits long if you want to go left-to-right instead of right-to-left.
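To illustrate the numeric route, here is a minimal sketch that goes right-to-left, peeling off the low 8 bits of the integer each time. The function name bits_to_text is made up for this example. Working from the right means the short leftmost group needs no explicit padding.

```python
def bits_to_text(bitstring):
    """Decode a bit string (with or without a '0b' prefix) into ASCII text."""
    n = int(bitstring, 2)  # with base 2, int() accepts an optional '0b' prefix
    chars = []
    while n:
        chars.append(chr(n & 0xFF))  # lowest 8 bits are the last remaining character
        n >>= 8                      # drop those 8 bits
    # Characters were collected last-to-first, so reverse them.
    return ''.join(reversed(chars))

print(bits_to_text('0b110100001100101011011000110110001101111'))
```

Note that leading NUL characters cannot be recovered this way, since they contribute nothing to the integer's value.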

agf

Posted 2011-09-13T04:34:14.173

Reputation: 110 160

Good answer. Whee! – jathanism – 2011-09-13T05:30:23.583

2

This is my way to solve your task:

str = "0b110100001100101011011000110110001101111"
str = "0" + str[2:]
message = ""
while str != "":
    i = chr(int(str[:8], 2))
    message = message + i
    str = str[8:]
print message
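The hard-coded "0" works for this input because bin() drops leading zeros: 'h' is 01101000, so the 40-bit message comes out as 39 digits. A sketch that computes the padding for any length instead (variable names here are illustrative, and the code is written for Python 3):

```python
raw = "0b110100001100101011011000110110001101111"
bits = raw[2:]                                   # strip the '0b' prefix
print(len(bits))                                 # not a multiple of 8 for this input

# Round the length up to the next multiple of 8 and left-pad with zeros.
padded = bits.zfill((len(bits) + 7) // 8 * 8)

# Decode each 8-bit block into a character.
message = ''.join(chr(int(padded[i:i + 8], 2))
                  for i in range(0, len(padded), 8))
print(message)
```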

Minh Triet Pham Tran

Posted 2011-09-13T04:34:14.173

Reputation: 294

Why are you adding '0' in str = "0" + str[2:]? The 0b needs to be removed here because it is at the beginning. – bimlesh sharma – 2013-10-05T19:53:12.260

1

If you don't want to import any modules, you can use this:

with open("Test1.txt", "r") as File1:
    St = ' '.join(format(ord(x), 'b') for x in File1.read())
StrList = St.split(" ")

to convert a text file to binary.

and you can use this to convert it back to string:

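# StrOrgMsg is assumed to hold the space-separated binary string (e.g. St above)
StrOrgList = StrOrgMsg.split(" ")
StrMsg = ""  # must be initialised before += is used

for StrValue in StrOrgList:
    if StrValue != "":
        StrMsg += chr(int(StrValue, 2))
print(StrMsg)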

hope that is helpful, i've used this with some custom encryption to send over TCP.
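The same idea can be sketched as a pair of functions that work on strings directly, without a file. The function names are invented for this example. Because the groups are separated by spaces, they do not need to be padded to 8 bits.

```python
def text_to_spaced_bits(text):
    # One unpadded binary group per character, separated by single spaces.
    return ' '.join(format(ord(ch), 'b') for ch in text)

def spaced_bits_to_text(spaced):
    # split() with no argument ignores repeated or trailing whitespace.
    return ''.join(chr(int(token, 2)) for token in spaced.split())

encoded = text_to_spaced_bits('hi there')
decoded = spaced_bits_to_text(encoded)
print(encoded)
print(decoded)
```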

Kyle Burns

Posted 2011-09-13T04:34:14.173

Reputation: 88

1

Are you looking for the code to do it or understanding the algorithm?

Does this do what you need? Specifically a2b_uu and b2a_uu? There are LOTS of other options in there in case those aren't what you want.

(NOTE: Not a Python guy but this seemed like an obvious answer)

Jaxidian

Posted 2011-09-13T04:34:14.173

Reputation: 8 215

@Jaxidian that was quite helpful for my purposes. Someone stored some data in a string and I have it. I am quite sure it's base64 binary because of the padding. I can successfully use b2a_base64 on that, however the result is, indeed, confusing at best. How do I get a list of booleans/integers (0,1) from there? – Ufos – 2018-02-15T17:15:26.927

I've been researching it for a bit, binascii isn't working for me, and mostly looking for the code, if I can see it I can understand it. Thanks though EDIT: when converting ascii to binary using binascii a2b_uu for "h" is \x00\x00\x00\x00\x00\x00\x00\x00 which is not what I need, I need 'hello' and actual 1's and 0's not shellcode looking ascii, also it only works char by char – sbrichards – 2011-09-13T04:43:22.893

-1

This is a spruced up version of J.F. Sebastian's. Thanks for the snippets though J.F. Sebastian.

import binascii, sys
def goodbye():
    sys.exit("\n"+"*"*43+"\n\nGood Bye! Come use again!\n\n"+"*"*43+"")
while __name__=='__main__':
    print "[A]scii to Binary, [B]inary to Ascii, or [E]xit:"
    var1=raw_input('>>> ')
    if var1=='a':
        string=raw_input('String to convert:\n>>> ')
        convert=bin(int(binascii.hexlify(string), 16))
        i=2
        truebin=[]
        while i!=len(convert):
            truebin.append(convert[i])
            i=i+1
        convert=''.join(truebin)
        print '\n'+'*'*84+'\n\n'+convert+'\n\n'+'*'*84+'\n'
    if var1=='b':
        binary=raw_input('Binary to convert:\n>>> ')
        n = int(binary, 2)
        done=binascii.unhexlify('%x' % n)
        print '\n'+'*'*84+'\n\n'+done+'\n\n'+'*'*84+'\n'
    if var1=='e':
        aus=raw_input('Are you sure? (y/n)\n>>> ')
        if aus=='y':
            goodbye()

TH33L1T3

Posted 2011-09-13T04:34:14.173

Reputation: 7