Login With Github

Python 3 Strings: String Definition, Operators and Formattings

Strings are one of the most basic data types in Python, used to represent textual data. Almost every application involves working with strings.

Define a String

We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes. Creating strings is as simple as assigning a value to a variable. For example:

string_1  = "Example string #1"
string_2 = 'Example string #2'

Both methods are equivalent. If a string is delimited with double quotes, any double quotation marks within the string will need to be escaped with a backslash (\):

"My teacher said \"Don't forget your homework.\""

Similarly, in single-quoted strings you will need to escape any apostrophes or single-quoted expressions:

'This is Linode\'s documentation site.'

Accessing Values in Strings

Python does not support a character type; these are treated as strings of length one, thus also considered a substring.

To access substrings, use the square brackets for slicing along with the index or indices to obtain your substring. For example:

#!/usr/bin/python3

var1 = 'Hello World!'
var2 = "Python Programming"

print ("var1[0]: ", var1[0])
print ("var2[1:5]: ", var2[1:5])

When the above code is executed, it produces the following result:

var1[0]:  H
var2[1:5]:  ytho

String Operators

The + and * operators are overridden for the string class, making it possible to add and multiply strings. Strings in Python are immutable. They cannot be modified after being created.

Using the add operator to combine strings is called concatenation. The strings for first and last name remain unchanged. Concatenating the two strings returns a new string.

Assume string variable a holds 'Hello' and variable b holds 'Python', then

Operator Description Example
+ Concatenation - Adds values on either side of the operator a + b will give HelloPython
* Repetition - Creates new strings, concatenating multiple copies of the same string a*2 will give HelloHello
[] Slice - Gives the character from the given index a[1] will give e
[ : ] Range Slice - Gives the characters from the given range a[1:4] will give ell
in Membership - Returns true if a character exists in the given string H in a will give 1
not in Membership - Returns true if a character does not exist in the given string M not in a will give 1
r/R Raw String - Suppresses actual meaning of Escape characters. The syntax for raw strings is exactly the same as for normal strings with the exception of the raw string operator, the letter "r," which precedes the quotation marks. The "r" can be lowercase (r) or uppercase (R) and must be placed immediately preceding the first quote mark. print r'\n' prints \n and print R'\n'prints \n
% Format - Performs String formatting See at next section

String Formatting Operator

Before Python 3.6, you had two main ways of embedding Python expressions inside string literals for formatting: %-formatting and str.format(). You’re about to see how to use them and what their limitations are.

Option #1: %-formatting

This is the OG of Python formatting and has been in the language since the very beginning. You can read more in the Python docs. Keep in mind that %-formatting is not recommended by the docs, which contain the following note:

“The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly).

Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.” (Source)

String objects have a built-in operation using the % operator, which you can use to format strings. Here’s what that looks like in practice:

>>> name = "Eric"
>>> "Hello, %s." % name
'Hello, Eric.'

In order to insert more than one variable, you must use a tuple of those variables. Here’s how you would do that:

>>> name = "Eric"
>>> age = 74
>>> "Hello, %s. You are %s." % (name, age)
'Hello Eric. You are 74.'

Here is the list of complete set of symbols which can be used along with %

Operator Name Description
%c character
%s string conversion via str() prior to formatting
%i signed decimal integer
%d signed decimal integer
%u unsigned decimal integer
%o octal integer
%x hexadecimal integer (lowercase letters)
%X hexadecimal integer (UPPERcase letters)
%e exponential notation (with lowercase 'e')
%E exponential notation (with UPPERcase 'E')
%f floating point real number
%g the shorter of %f and %e
%G the shorter of %f and %E

The code examples that you just saw above are readable enough. However, once you start using several parameters and longer strings, your code will quickly become much less easily readable. Things are starting to look a little messy already:

>>> first_name = "Eric"
>>> last_name = "Idle"
>>> age = 74
>>> profession = "comedian"
>>> affiliation = "Monty Python"
>>> "Hello, %s %s. You are %s. You are a %s. You were a member of %s." % (first_name, last_name, age, profession, affiliation)
'Hello, Eric Idle. You are 74. You are a comedian. You were a member of Monty Python.'

Unfortunately, this kind of formatting isn’t great because it is verbose and leads to errors, like not displaying tuples or dictionaries correctly. Fortunately, there are brighter days ahead.

Option #2: str.format()

This newer way of getting the job done was introduced in Python 2.6. You can check out the Python docs for more info.

str.format() is an improvement on %-formatting. It uses normal function call syntax and is extensible through the __format__() method on the object being converted to a string.

With str.format(), the replacement fields are marked by curly braces:

>>> "Hello, {}. You are {}.".format(name, age)
'Hello, Eric. You are 74.'

You can reference variables in any order by referencing their index:

>>> "Hello, {1}. You are {0}.".format(age, name)
'Hello, Eric. You are 74.'

But if you insert the variable names, you get the added perk of being able to pass objects and then reference parameters and methods in between the braces:

>>> person = {'name': 'Eric', 'age': 74}
>>> "Hello, {name}. You are {age}.".format(name=person['name'], age=person['age'])
'Hello, Eric. You are 74.'

You can also use ** to do this neat trick with dictionaries:

>>> person = {'name': 'Eric', 'age': 74}
>>> "Hello, {name}. You are {age}.".format(**person)
'Hello, Eric. You are 74.'

str.format() is definitely an upgrade when compared with %-formatting, but it’s not all roses and sunshine.

Code using str.format() is much more easily readable than code using %-formatting, but str.format() can still be quite verbose when you are dealing with multiple parameters and longer strings. Take a look at this:

>>> first_name = "Eric"
>>> last_name = "Idle"
>>> age = 74
>>> profession = "comedian"
>>> affiliation = "Monty Python"
>>> print(("Hello, {first_name} {last_name}. You are {age}. " + 
>>>        "You are a {profession}. You were a member of {affiliation}.") \
>>>        .format(first_name=first_name, last_name=last_name, age=age, \
>>>                profession=profession, affiliation=affiliation))
'Hello, Eric Idle. You are 74. You are a comedian. You were a member of Monty Python.'

If you had the variables you wanted to pass to .format() in a dictionary, then you could just unpack it with .format(**some_dict) and reference the values by key in the string, but there has got to be a better way to do this.

Option #3: f-Strings

Python 3.6 introduced a simpler way to format strings: formatted string literals, usually referred to as f-strings.

The syntax is similar to the one you used with str.format() but less verbose. Look at how easily readable this is:

>>> name = "Eric"
>>> age = 74
>>> f"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'

It would also be valid to use a capital letter F:

>>> F"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'

Any Python expression can be placed inside the brackets in an f-string, giving them even more flexibility:

>>> orders = [14.99,19.99,10]
>>> f'You have {len(orders)} items in your cart, for a total cost of ${sum(orders)}.

You can have multiline strings:

>>> name = "Eric"
>>> profession = "comedian"
>>> affiliation = "Monty Python"
>>> message = (
...     f"Hi {name}. "
...     f"You are a {profession}. "
...     f"You were in {affiliation}."
... )
>>> message
'Hi Eric. You are a comedian. You were in Monty Python.'

But remember that you need to place an f in front of each line of a multiline string.

Escape Characters

Following table is a list of escape or non-printable characters that can be represented with backslash notation.

An escape character gets interpreted; in a single quoted as well as double quoted strings.

Backslash notation Hexadecimal character Description
\a 0x07 Bell or alert
\b 0x08 Backspace
\cx Control-x
\C-x Control-x
\e 0x1b Escape
\f 0x0c Formfeed
\M-\C-x Meta-Control-x
\n 0x0a Newline
\nnn Octal notation, where n is in the range 0.7
\r 0x0d Carriage return
\s 0x20 Space
\t 0x09 Tab
\v 0x0b Vertical tab
\x Character x
\xnn Hexadecimal notation, where n is in the range 0.9, a.f, or A.F

0 Comment

temp