Strings in Go Programming Language
Strings in Go
A string is a sequence of zero or more characters (letters, numbers, symbols) that can be either a constant or a variable.
Strings are defined between double quotes "..." and not single quotes, unlike JavaScript. You can also create strings within back quotes `...` (sometimes referred to as backticks). Depending on which quotes you use, the string will behave differently.
In Go, a string is an immutable sequence of arbitrary bytes. By convention, and always for string literals written in Go source code, the text is encoded as UTF-8.
If you are not familiar with the different Data Types in Go, learn about them first before proceeding further.
Strings using back quotes `...`
Using back quotes, as in `bar`, will create a raw string literal. In a raw string literal, any character may appear between the quotes, with the exception of a back quote, and backslashes have no special meaning. Here's an example of a raw string literal-
package main

import "fmt"

func main() {
	s := `Welcome to "DebugPointer"!`
	fmt.Println(s)
}
Output of the program-
Welcome to "DebugPointer"!
Program exited.
Raw string literals can also be used to create multi-line strings-
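package main

import "fmt"

func main() {
	// The text starts right after the opening back quote, so the
	// string does not begin with an empty line.
	s := `Welcome
to
"DebugPointer"!`
	fmt.Println(s)
}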
Output of the program-
Welcome
to
"DebugPointer"!
Program exited.
Strings using double quotes "..."
Strings created using double quotes are called interpreted string literals, for example "bar". Within the quotes, any characters can be used except a newline and an unescaped double quote. Escape sequences such as \n and \" are interpreted.
Some examples-
package main

import "fmt"

func main() {
	s := "Welcome to \"DebugPointer\"!" // this is valid as the double quotes are escaped
	t := "Hello, Thank you for the payment of $200.00"
	fmt.Println(s)
	fmt.Println(t)
}
Output of the program-
Welcome to "DebugPointer"!
Hello, Thank you for the payment of $200.00
Program exited.
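To make the difference between the two kinds of literal concrete, here is a minimal sketch that writes the same source text both ways. The helper name literals is only for illustration and is not part of any package. In the interpreted literal the sequence \t becomes a single tab character, while the raw literal keeps the backslash and the t as two separate characters.

```go
package main

import "fmt"

// literals returns the same source text written as an interpreted
// literal and as a raw literal. The name is illustrative only.
func literals() (interpreted, raw string) {
	interpreted = "a\tb" // \t is processed into one tab character
	raw = `a\tb`         // backslash and t are kept as they are
	return interpreted, raw
}

func main() {
	interpreted, raw := literals()
	fmt.Println(len(interpreted)) // 3 bytes: a, tab, b
	fmt.Println(len(raw))         // 4 bytes: a, \, t, b
	fmt.Printf("%q\n", raw)       // prints "a\\tb"
}
```

This is why raw literals are handy for text that contains many backslashes or quotes, such as regular expressions or file paths.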
Although strings can only be written within double quotes or back quotes, a single character can be written within single quotes, for example 'z' or '7' or '*'. Such a value is a rune literal, not a string.
Length of a String in Go
To find the length of a string, you can use the len function. len is a built-in function of the language, hence you don't need to import it from any package.
package main

import "fmt"

func main() {
	s := "Welcome to DebugPointer"
	fmt.Println(len(s))
}
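In the above program, len(s) returns 23, which is then printed to the console. Note that len counts bytes, not characters. The string s has 23 ASCII characters (characters include spaces), and each of them occupies exactly one byte, so both counts are the same here.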
Output of the program-
23
Program exited.
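For text that is not pure ASCII, the number of bytes and the number of characters can differ. The sketch below compares len with utf8.RuneCountInString from the standard unicode/utf8 package. The helper name sizes is an assumption made for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizes reports the byte length and the rune (code point) count of s.
// The name is illustrative only.
func sizes(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(sizes("Welcome to DebugPointer")) // 23 23
	// "h\u00e9llo" is "héllo"; the é takes two bytes in UTF-8.
	fmt.Println(sizes("h\u00e9llo")) // 6 5
}
```

So if you need the number of characters a reader would count, the rune count is usually closer to what you want than len.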
Strings are Immutable in Go
In Go, a string is in effect a read-only slice of bytes. When we use the len function on a string, it returns the length of that slice. Strings are immutable in Go, so it is not possible to change a specific character in a string by assigning to its index. You will get a compile error which states - cannot assign to s[0]. Here is an example that demonstrates immutable strings in Go-
package main

import "fmt"

func main() {
	s := "DebugPointer"
	s[0] = 'P' // compile error: strings are immutable
	fmt.Println(s)
}
Output of the program-
./prog.go:8:7: cannot assign to s[0]
Go build failed.
Code Unit and Code Point
The fixed-size unit that an encoding uses to store text is called a code unit. UTF-8 uses 8-bit code units and UTF-16 uses 16-bit code units, which means UTF-8 needs a minimum of 8 bits, or 1 byte, to represent a character.
A code point is the numerical value that identifies a character, and it is represented by one or more code units depending on the encoding. As UTF-8 is compatible with ASCII, all ASCII characters are represented in a single byte (8 bits), hence UTF-8 needs only 1 code unit to represent them. Other characters need 2 to 4 code units.
For Loop in Go
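When we use a for loop with range on a string, it does not return one byte at a time. It decodes the UTF-8 bytes and returns one code point (as a rune) per iteration, together with the byte index at which that code point starts. As so far all our characters were in the ASCII character set, every code point occupies exactly one byte, or one code unit, so the index simply increases by one on each iteration.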
Here is an example of the same-
package main

import "fmt"

func main() {
	s := "Welcome to DebugPointer"
	for index, char := range s {
		fmt.Printf("character at index %d is %c\n", index, char)
	}
}
Output of the program- Observe that the space characters at index 7 and index 10 are printed as well.
character at index 0 is W
character at index 1 is e
character at index 2 is l
character at index 3 is c
character at index 4 is o
character at index 5 is m
character at index 6 is e
character at index 7 is
character at index 8 is t
character at index 9 is o
character at index 10 is
character at index 11 is D
character at index 12 is e
character at index 13 is b
character at index 14 is u
character at index 15 is g
character at index 16 is P
character at index 17 is o
character at index 18 is i
character at index 19 is n
character at index 20 is t
character at index 21 is e
character at index 22 is r
Program exited.
Rune
As said earlier, a string is simply a read-only slice of bytes (or uint8 integers). When we run a for loop with range over it, what we get on each iteration is a rune, because range decodes each UTF-8 encoded character into a value of the rune data type. rune is just an alias for the int32 data type. To freshen your knowledge on data types, check - Data Types in Go
As a reminder, although strings can only be written within double quotes or back quotes, a single character can be written within single quotes, for example 'z'. Such a literal is a rune.
package main

import "fmt"

func main() {
	r := 'a'
	fmt.Printf("Hexa - %x \n", r)
	fmt.Printf("ASCII - %v \n", r)
	fmt.Printf("Type - %T", r)
}
Output-
Hexa - 61
ASCII - 97
Type - int32
Program exited
As you can see in the above program, a character can be printed as a hexadecimal value or as a decimal value, which for ASCII characters is the ASCII code. We can also print its type.
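The same idea extends beyond ASCII. As a small sketch, the hypothetical helper describe below formats a rune as the character itself, its Unicode code point using the %U verb of the fmt package, and its decimal value.

```go
package main

import "fmt"

// describe formats r as "<character> <code point> <decimal value>".
// The name is illustrative only.
func describe(r rune) string {
	return fmt.Sprintf("%c %U %d", r, r, r)
}

func main() {
	fmt.Println(describe('a'))      // a U+0061 97
	fmt.Println(describe('\u4e16')) // 世 U+4E16 19990
}
```

For 'a' the decimal value matches the ASCII code shown earlier, while the second rune has a value far outside the ASCII range and still fits in the int32 behind the rune type.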