How to Base64 Encode/Decode in Python, PHP, and Node
Every backend language handles Base64 differently. Python's base64 module has multiple functions for different variants. PHP has base64_encode() and base64_decode() as builtins. Node.js uses the Buffer class with an encoding parameter. Each has its own quirks around Unicode text, binary files, URL-safe variants, and padding. This guide is a complete practical reference for Base64 encoding and decoding in all three languages — with copy-pasteable code for the most common tasks and notes on the edge cases that cause real bugs.
Base64 in Python: The base64 Module
Python's standard library includes the base64 module, which supports several encoding variants.
Encoding bytes to standard Base64: import base64; encoded = base64.b64encode(b'Hello World'). This returns a bytes object: b'SGVsbG8gV29ybGQ='. Call .decode('ascii') or .decode('utf-8') to get a plain string.
Decoding standard Base64 back to bytes: base64.b64decode('SGVsbG8gV29ybGQ=') returns bytes. For text, call .decode('utf-8') on the result.
Encoding a file to Base64: with open('image.jpg', 'rb') as f: encoded = base64.b64encode(f.read()). The rb mode (read binary) is critical. In text mode, Python 3 tries to decode the file as text, which typically fails with UnicodeDecodeError on binary data, and b64encode() accepts only bytes-like objects, not str.
Base64url encoding: base64.urlsafe_b64encode(data). This uses - and _ instead of + and /. The output still includes = padding by default. To strip it: base64.urlsafe_b64encode(data).rstrip(b'=').decode('ascii').
Decoding MIME Base64 (with line breaks): base64.decodebytes() takes bytes input and accepts the newlines produced by the 76-character line wrapping in MIME email. b64decode() in its default mode (validate=False) also discards characters outside the alphabet, including whitespace, so it handles wrapped input too. Pass validate=True when you want such characters rejected instead.
Handling missing padding: if you receive a Base64 string with stripped padding (common from JWT segments and Base64url), add exactly the missing padding before decoding: padded = s + '=' * (-len(s) % 4); base64.urlsafe_b64decode(padded). Use urlsafe_b64decode() for Base64url input, because b64decode() in its default mode silently discards - and _ rather than translating them. A common shortcut is base64.b64decode(s + '=='): CPython's default, non-validating decoder tolerates excess padding, but that relies on lenient behavior, so computing the exact padding is the safer habit.
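The Python tasks above can be gathered into a few small helpers. This is a minimal sketch rather than a canonical implementation: the function names (b64_encode_text, b64_decode_text, b64url_encode, b64url_decode) are made up for illustration, and only the base64 module calls come from the standard library.

```python
import base64


def b64_encode_text(text: str) -> str:
    """Encode a str as standard Base64 and return a str (hypothetical helper)."""
    # Base64 works on bytes, so encode the text to UTF-8 first.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def b64_decode_text(encoded: str) -> str:
    """Decode standard Base64 that is known to contain UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


def b64url_encode(data: bytes) -> str:
    """Encode bytes as Base64url with the '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(encoded: str) -> bytes:
    """Decode Base64url, restoring exactly the padding that was stripped."""
    # -len % 4 yields 0, 1, 2 or 3 (3 only for malformed input).
    padding = "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded + padding)


print(b64_encode_text("Hello World"))  # SGVsbG8gV29ybGQ=
print(b64_decode_text("SGVsbG8gV29ybGQ="))  # Hello World

# Round trip through the unpadded URL-safe variant.
token = b64url_encode(b"\xfb\xff")
print(token, b64url_decode(token) == b"\xfb\xff")
```

Computing the padding from the length avoids depending on how tolerant a particular decoder is of extra = characters.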
Base64 in PHP: Built-in Functions
PHP provides Base64 encoding and decoding as built-in functions, making it one of the simplest languages for this operation.
Encoding: base64_encode($data). It works on strings, which in PHP are sequences of bytes; there is no distinction between text and binary at the language level. The function returns the Base64-encoded string including any required = padding.
Decoding: base64_decode($encodedString). Returns the decoded bytes as a PHP string, or false on failure. Always check for a false return: if ($decoded === false) { /* handle error */ }.
Base64url encoding: PHP does not have a built-in Base64url function. Use rtrim(strtr(base64_encode($data), '+/', '-_'), '=') to encode without padding (drop the rtrim() call to keep the padding), and base64_decode(strtr($input, '-_', '+/')) to decode. To restore padding on decode: base64_decode(str_pad(strtr($input, '-_', '+/'), strlen($input) + (4 - strlen($input) % 4) % 4, '=')).
JWT payload inspection: $parts = explode('.', $jwt); $payload = json_decode(base64_decode(strtr($parts[1], '-_', '+/')), true). This splits the JWT, takes the payload section, converts Base64url to standard Base64, decodes it, and parses the JSON in one chain.
A practical gotcha: base64_decode($data, true), with the second parameter strict set to true, returns false for input containing characters outside the Base64 alphabet. Without strict mode, PHP silently skips invalid characters, which can mask bugs and means a false return is rare in that mode. For security-sensitive decoding, always use strict mode.
Base64 in Node.js: The Buffer Class
Node.js handles Base64 through the Buffer class, which was designed for binary data handling before TypedArrays were available in JavaScript.
Encoding text to Base64: Buffer.from('Hello World', 'utf8').toString('base64'). Returns the string 'SGVsbG8gV29ybGQ='.
Encoding a file to Base64: const fs = require('fs'); const encoded = fs.readFileSync('image.jpg').toString('base64'). Reading the file returns a Buffer, and calling .toString('base64') encodes it.
Decoding Base64 to text: Buffer.from('SGVsbG8gV29ybGQ=', 'base64').toString('utf8'). Returns 'Hello World'.
Decoding Base64 to a file: const buffer = Buffer.from(base64String, 'base64'); fs.writeFileSync('output.pdf', buffer). The Buffer is written as raw binary bytes.
Base64url in Node.js: the 'base64url' encoding was added in v15.7.0 and backported to v14.18.0, so it is available in every Node.js 16+ release. Buffer.from(data).toString('base64url') encodes with - and _, no padding. Buffer.from(str, 'base64url') decodes. On older versions, do it manually: take the 'base64' output, replace + with - and / with _, and strip the trailing =; reverse the substitutions before decoding.
Async file encoding in modern Node.js: const { readFile } = require('fs/promises'); const buffer = await readFile('file.pdf'); const encoded = buffer.toString('base64').
Deno and edge runtimes: these expose the web-standard btoa() and atob() globals along with TextEncoder/TextDecoder. btoa() only accepts binary strings (characters with code points 0 to 255), so convert UTF-8 text to bytes with TextEncoder and build a binary string from those bytes before calling btoa().
Cross-Language Compatibility and Common Pitfalls
When Base64 data is generated by one language and consumed by another (an API backend in Python sending Base64 data to a JavaScript frontend, for example), compatibility issues arise from subtle differences in handling.
Padding differences: Python's b64encode(), Node.js's 'base64' encoding, and PHP's base64_encode() all emit = padding. Base64url implementations often strip it, and Node's 'base64url' encoding always does. If your JavaScript frontend receives a padding-stripped Base64url string from a Python backend, convert - and _ back to + and / before calling atob(), which rejects the URL-safe characters. Browser atob() tolerates missing padding, but many other decoders do not, so restore the padding whenever the consumer is strict.
Line breaks: Python's base64.encodebytes() inserts a newline after every 76 characters of output (MIME-style) and ends with a trailing newline. base64.b64encode() does not. Embedded newlines can be rejected by strict decoders and are awkward to carry in JSON fields or URLs. Use b64encode() for API payloads, and encodebytes() only for MIME/email.
Encoding detection: there is no reliable way to tell standard Base64 from Base64url when a string contains only the 62 alphanumeric characters that both alphabets share. The presence of + or / indicates standard Base64; the presence of - or _ indicates Base64url. If none of these characters appears, both decoders produce identical output.
Unicode strings: all three languages handle Unicode differently at the Base64 layer. In Python, you must encode a str to bytes before calling b64encode(), because Base64 operates on bytes, not strings. In PHP, strings are bytes, so there is no distinction. In Node.js, Buffer.from(str) defaults to 'utf8'; pass the encoding explicitly, as in Buffer.from(str, 'utf8'), so the intent is visible and a different encoding is never picked up by accident. When in doubt, always encode to UTF-8 bytes explicitly before Base64-encoding text.
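On the receiving side, the pitfalls above (URL-safe alphabet, stripped padding, MIME line breaks) can be handled by normalizing the input before a strict decode. The following is one possible sketch in Python, assuming the payload is either standard Base64 or Base64url; decode_any_base64 is a hypothetical helper name, not a library function.

```python
import base64
import binascii
import re


def decode_any_base64(value: str) -> bytes:
    """Decode standard or URL-safe Base64, padded or not (hypothetical helper)."""
    # Remove whitespace such as MIME line breaks.
    compact = re.sub(r"\s+", "", value)
    # Map the URL-safe alphabet onto the standard one.
    standard = compact.replace("-", "+").replace("_", "/")
    # Drop whatever padding is present, then restore exactly what is needed.
    standard = standard.rstrip("=")
    if len(standard) % 4 == 1:
        raise ValueError("invalid Base64 length")
    standard += "=" * (-len(standard) % 4)
    try:
        # validate=True rejects stray characters instead of skipping them.
        return base64.b64decode(standard, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid Base64 input: {exc}") from exc


print(decode_any_base64("SGVsbG8gV29ybGQ="))  # b'Hello World'
print(decode_any_base64("SGVsbG8gV29ybGQ"))   # b'Hello World' (padding restored)

# MIME-wrapped output from encodebytes() survives the round trip.
wrapped = base64.encodebytes(bytes(range(256))).decode("ascii")
print(decode_any_base64(wrapped) == bytes(range(256)))
```

Accepting both alphabets is a convenience trade-off: if the producer is known to emit one specific variant, decoding only that variant is stricter and easier to reason about.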
Frequently Asked Questions
- Why does Python base64.b64encode() return bytes instead of a string?
- Because Base64 encoding in Python operates on bytes objects, not strings. The distinction matters in Python 3, where str and bytes are strictly separate types. The encoded Base64 output is a bytes object containing ASCII characters. To use it as a Python string — for JSON serialization, printing, or insertion into a URL — call .decode('ascii') or .decode('utf-8') on the result. The ASCII characters used by Base64 (A-Z, a-z, 0-9, +, /, =) are identical in UTF-8 and ASCII, so either decoding works.
- What is the easiest way to decode a JWT in Python or PHP?
- In Python: split the JWT on dots, take the second element (the payload), restore the = padding, then call base64.urlsafe_b64decode() and json.loads(). One line: import base64, json; seg = jwt.split('.')[1]; payload = json.loads(base64.urlsafe_b64decode(seg + '=' * (-len(seg) % 4))). Avoid plain b64decode() here: in its default mode it silently discards - and _ instead of translating them, which corrupts the payload. In PHP: $payload = json_decode(base64_decode(strtr(explode('.', $jwt)[1], '-_', '+/')), true). These decode only; they do not validate the signature. Use a proper JWT library for production code that actually validates signatures.
- Can I use the same Base64 string across Python, PHP, and Node.js without modification?
- Standard Base64 strings are fully interoperable across all languages and platforms. A string produced by Python's b64encode(), PHP's base64_encode(), or Node.js Buffer .toString('base64') can be decoded correctly by any of the others without modification. Issues only arise with variants: Base64url strings may need the - and _ characters substituted before decoding with standard decoders, and MIME Base64 strings with line breaks need whitespace stripped. Within standard Base64 — same characters, same padding — full interoperability is guaranteed.
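The Python side of the JWT answer above can be written out as a small function. This is a sketch for inspection and debugging only: jwt_payload_unverified and the demo token are hypothetical, the token is built locally with a placeholder signature, and nothing here verifies a signature.

```python
import base64
import json


def jwt_payload_unverified(token: str) -> dict:
    """Return the decoded JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    segment = parts[1]
    # JWT segments are Base64url with padding stripped; restore it.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url_json(obj: dict) -> str:
    """Build one unpadded Base64url JSON segment (demo helper only)."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A made-up token for demonstration; the third segment is not a real signature.
demo_token = ".".join([
    _b64url_json({"alg": "none"}),
    _b64url_json({"sub": "1234", "admin": False}),
    "signature-placeholder",
])

print(jwt_payload_unverified(demo_token))
```

Anything read this way is untrusted input until a JWT library has checked the signature.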