The Complete Guide to Strings in Go: Basics, Manipulation Techniques, and Functions

String processing is part of nearly every program. Whether you are handling user input, building a user interface, or reading data from a database, you will work with strings. Go, a modern, high-performance language, provides simple and efficient tools and standard library functions for the job.

Definition and characteristics of strings

In Go, a string is a read-only sequence of bytes, usually used to represent text. Strings are immutable: you cannot modify a character of an existing string, but you can build a new string from it.

greeting := "Hello, Go!"
fmt.Println(greeting)

Go string immutability principle

Every string created in Go is immutable. This means you cannot directly modify the characters in the string. This design brings performance benefits: because the bytes can never change, copying or passing a string only copies a small header (a pointer and a length), not the underlying data.

Example

package main

import "fmt"

func main() {
    original := "hello"
    modified := original

    modified = "world"

    fmt.Println(original)  // Outputs: hello
    fmt.Println(modified)  // Outputs: world
}

In the example, assigning original to modified makes both variables hold the same string value. Assigning "world" to modified then makes modified hold a different string; it does not alter the bytes of "hello", so original is unaffected.
Because strings in Go are immutable, no operation on modified can ever change what original holds.
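
To make the immutability rule more concrete, here is a minimal sketch of how a "modified" string is usually produced: copy the bytes into a slice, change the copy, and build a new string. The helper name withFirstByte is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// withFirstByte returns a copy of s whose first byte is replaced by b.
// The helper name is hypothetical; it exists only for this illustration.
func withFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // copies the string's bytes into a mutable slice
	buf[0] = b
	return string(buf) // builds a new string; s itself is untouched
}

func main() {
	original := "hello"
	// original[0] = 'j' // would not compile: strings cannot be assigned to by index
	changed := withFirstByte(original, 'j')
	fmt.Println(original) // hello
	fmt.Println(changed)  // jello
}
```

The commented-out line shows the operation that the compiler rejects; the conversion to []byte is what makes the change possible, at the cost of a copy.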


Internal representation of Go strings

A Go string is backed by a byte array, which means a string can hold arbitrary bytes, not just UTF-8 text.

Example

s := "hello"

Internally, this can be represented as:
- A byte array: [104, 101, 108, 108, 111] (These are the ASCII values for the letters h, e, l, l, o respectively)
- A pointer to the beginning of this array.
- A length of 5 indicating that there are 5 bytes in the string.
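
The representation described above can be observed from ordinary code. The following sketch prints the bytes and length of "hello" and then shows that a string may hold bytes that are not valid UTF-8. The function name describe is an assumption made for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the byte length of s and whether its bytes form valid UTF-8.
// The function name is made up for this sketch.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	s := "hello"
	fmt.Println([]byte(s)) // [104 101 108 108 111]
	fmt.Println(len(s))    // 5

	raw := "\xff\xfe" // two bytes that are not valid UTF-8
	n, ok := describe(raw)
	fmt.Println(n, ok) // 2 false
}
```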


String operations and applications

Manipulating strings is part of everyday programming tasks, and the Go language provides a complete set of tools and standard library functions to make these operations simple and efficient.


String concatenation

In Go, two or more strings can be concatenated using the + operator.

Example

str1 := "Hello, "
str2 := "world!"
result := str1 + str2
fmt.Println(result) // Outputs: Hello, world!
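
Because strings are immutable, every + produces a new string. When many pieces are joined in a loop, strings.Builder from the standard library is a common alternative. The sketch below assumes a hypothetical helper, repeatWithBuilder, purely to show the pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// repeatWithBuilder concatenates word n times, separated by sep,
// using a strings.Builder instead of repeated + operations.
func repeatWithBuilder(word, sep string, n int) string {
	var b strings.Builder
	for i := 0; i < n; i++ {
		if i > 0 {
			b.WriteString(sep)
		}
		b.WriteString(word)
	}
	return b.String()
}

func main() {
	fmt.Println(repeatWithBuilder("go", "-", 3)) // go-go-go
}
```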

String slice

Because a Go string is backed by a sequence of bytes, you can slice it like an array or slice to get a substring. Slice indices are byte offsets, not character positions.

Example

str := "Hello, World!"
// Slicing from index 7 up to, but not including, index 12
substring := str[7:12]
fmt.Println(substring) // Outputs: "World"
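
Byte-based slicing works well for ASCII text, but a slice boundary can fall in the middle of a multi-byte character. One possible approach, sketched here under the assumption that counting by code point is what you want, is to slice a []rune instead. The helper runeSubstring is hypothetical.

```go
package main

import "fmt"

// runeSubstring returns the runes of s in the half-open range [start, end).
// Indexes count runes, not bytes; out-of-range values are clamped.
func runeSubstring(s string, start, end int) string {
	r := []rune(s)
	if start < 0 {
		start = 0
	}
	if end > len(r) {
		end = len(r)
	}
	if start >= end {
		return ""
	}
	return string(r[start:end])
}

func main() {
	s := "Go世界!"
	fmt.Println(s[2:5])                 // 世 (bytes 2 to 4 hold its three-byte encoding)
	fmt.Println(runeSubstring(s, 2, 4)) // 世界
}
```

Converting to []rune copies the whole string, so this is a convenience rather than a fast path.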

String search

You can easily find substrings or characters using functions in the strings package, such as Contains and Index.

Example

package main

import (
  "fmt"
  "strings"
)

func main() {
  // The string in which we want to search
  haystack := "Hello, Golang World!"

  // The substring we want to search for
  needle := "Golang"

  // Use the strings.Contains function to check if the haystack contains the needle
  if strings.Contains(haystack, needle) {
    fmt.Println("Found:", needle)
  } else {
    fmt.Println(needle, "not found!")
  }
}
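
Contains only answers yes or no. When the position matters, strings.Index and strings.LastIndex return byte offsets, or -1 if the substring is absent. A small sketch follows; the wrapper name locate is an assumption for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// locate returns the byte offsets of the first and last occurrence of needle
// in haystack; both are -1 when needle does not occur.
func locate(haystack, needle string) (first, last int) {
	return strings.Index(haystack, needle), strings.LastIndex(haystack, needle)
}

func main() {
	first, last := locate("Hello, Golang World!", "o")
	fmt.Println(first, last) // 4 15

	first, last = locate("Hello, Golang World!", "Rust")
	fmt.Println(first, last) // -1 -1
}
```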

String comparison

Go strings can be compared for equality with the built-in == operator. In addition, the Compare function in the strings package can be used to determine the lexicographic order of two strings.

Example

package main

import (
  "fmt"
)

func main() {
  str1 := "hello"
  str2 := "world"
  str3 := "hello"

  // Comparing strings
  if str1 == str2 {
    fmt.Println("str1 and str2 are equal.")
  } else {
    fmt.Println("str1 and str2 are not equal.")
  }

  if str1 == str3 {
    fmt.Println("str1 and str3 are equal.")
  } else {
    fmt.Println("str1 and str3 are not equal.")
  }
}

// Output:
// str1 and str2 are not equal.
// str1 and str3 are equal.
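
The example above covers equality only. As a complement, this sketch shows strings.Compare for ordering, the ordinary < operator, and strings.EqualFold for case-insensitive equality. The helper name order is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// order describes how a sorts relative to b using strings.Compare,
// which returns -1, 0, or +1.
func order(a, b string) string {
	switch strings.Compare(a, b) {
	case -1:
		return "before"
	case 1:
		return "after"
	default:
		return "equal"
	}
}

func main() {
	fmt.Println(order("hello", "world")) // before
	fmt.Println(order("hello", "hello")) // equal

	// The ordinary comparison operators also compare lexicographically, byte by byte.
	fmt.Println("apple" < "banana") // true

	// EqualFold reports equality while ignoring case.
	fmt.Println(strings.EqualFold("Go", "GO")) // true
}
```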

String Replacement

Using the strings.Replace and strings.ReplaceAll functions, you can replace substrings within a string.

Example

package main

import (
  "fmt"
  "strings"
)

func main() {
  s := "banana"
  
  // Replace the first 2 occurrences of 'a' with 'o'
  r1 := strings.Replace(s, "a", "o", 2)
  fmt.Println(r1) // "bonona"

  // Replace all occurrences of 'a' with 'o'
  r2 := strings.ReplaceAll(s, "a", "o")
  fmt.Println(r2) // "bonono"
}

String Case Conversion

The strings package provides the ToUpper and ToLower functions for case conversion.

Example

str := "GoLang"
lowercase := strings.ToLower(str)
uppercase := strings.ToUpper(str)
fmt.Println(lowercase) // golang
fmt.Println(uppercase) // GOLANG

Use Regular Expressions to process strings

Go's regexp package provides a series of functions to query, match, replace and split strings using regular expressions.

Example

import "regexp"

str := "My email is example@example.com"
re := regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}`)
email := re.FindString(str)
fmt.Println(email)
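
The snippet above shows querying with FindString. To illustrate the matching, replacing and splitting mentioned in the text, here is a short sketch. The patterns and the helper names maskDigits and splitList are assumptions chosen for the example, not recommended production patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compile once at package level; MustCompile panics on an invalid pattern.
var (
	digits    = regexp.MustCompile(`[0-9]+`)
	separator = regexp.MustCompile(`\s*,\s*`)
)

// maskDigits replaces every run of digits in s with a single '#'.
func maskDigits(s string) string {
	return digits.ReplaceAllString(s, "#")
}

// splitList splits s on commas, ignoring whitespace around each comma.
func splitList(s string) []string {
	return separator.Split(s, -1)
}

func main() {
	fmt.Println(digits.MatchString("room 42"))      // true
	fmt.Println(maskDigits("order 123, item 45"))   // order #, item #
	fmt.Println(splitList("apple , banana,cherry")) // [apple banana cherry]
}
```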

String encryption and hashing

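Go's crypto packages provide a variety of cryptographic algorithms, including hash functions that you can use to compute a digest of a string. Note that hashing is one-way and is not encryption, and that MD5, used below for brevity, is no longer considered secure for cryptographic purposes.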

Example

import (
    "crypto/md5"
    "fmt"
    "io"
)

str := "secret data"
hasher := md5.New()
io.WriteString(hasher, str)
fmt.Printf("%x\n", hasher.Sum(nil))
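
MD5 is no longer considered safe for security-sensitive uses, so a stronger hash is usually chosen. As a sketch, the same idea with SHA-256 from crypto/sha256 might look like this. The helper name sha256Hex is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the SHA-256 digest of s as a lowercase hex string.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s)) // sum is a [32]byte array
	return hex.EncodeToString(sum[:])
}

func main() {
	digest := sha256Hex("secret data")
	fmt.Println(len(digest)) // 64 hex characters for a 32-byte digest
	fmt.Println(digest)
}
```

A plain hash like this suits checksums and content identifiers; password storage calls for a dedicated password-hashing scheme instead.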

String Splitting

Using the strings.Split function, you can split a string into a slice of substrings by a specified delimiter.

Example

str := "apple,banana,cherry"
items := strings.Split(str, ",")
fmt.Println(items)

String Merging

The strings.Join function combines a slice of strings into a single string, inserting a separator between the elements.

Example

items := []string{"apple", "banana", "cherry"}
str := strings.Join(items, ", ")
fmt.Println(str)

Get characters in string

Individual bytes of a string can be accessed by index. Indexing returns the byte value at that position, not a character.

Example

str := "Go"
byteValue := str[1]
fmt.Println(byteValue) // 111, the byte value of 'o'

Traversing characters in a string

Use a for range loop to iterate over each character in a string.

Example

str := "Go"
for index, char := range str {
    fmt.Printf("At index %d, char: %c\n", index, char)
}

Trim String

The strings.TrimSpace function removes whitespace from the beginning and end of a string.

Example

str := "   Go Lang   "
trimmed := strings.TrimSpace(str)
fmt.Println(trimmed)

Padding String

Using the fmt package, you can pad or align strings using specific format modifiers.

Example

str := "Go"
padded := fmt.Sprintf("%-10s", str)
fmt.Println(padded)
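
The %-10s verb above left-aligns within a fixed width. As a sketch of the related modifiers, the width can also be passed as an argument with *, and omitting the minus sign right-aligns. The helper names padLeft and padRight are hypothetical.

```go
package main

import "fmt"

// padLeft right-aligns s in a field of the given width by adding spaces on the left.
func padLeft(s string, width int) string {
	return fmt.Sprintf("%*s", width, s)
}

// padRight left-aligns s in a field of the given width by adding spaces on the right.
func padRight(s string, width int) string {
	return fmt.Sprintf("%-*s", width, s)
}

func main() {
	fmt.Printf("[%s]\n", padLeft("Go", 6))  // [    Go]
	fmt.Printf("[%s]\n", padRight("Go", 6)) // [Go    ]
	fmt.Printf("[%05d]\n", 42)              // [00042]
}
```

Strings longer than the requested width are returned unchanged; the width is a minimum, not a limit.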

String Statistics

The strings.Count function counts the number of non-overlapping times a substring appears in a string.

Example

str := "Go is easy to learn. Go is powerful."
count := strings.Count(str, "Go")
fmt.Println(count)

Go string character encoding

Strings are stored and represented in computers through character encoding. Go source code is UTF-8, so string literals are UTF-8 encoded by default, which means a string can easily represent any Unicode character.

What is character encoding?

Character encoding is a set of rules for converting characters into numeric codes that computers can understand. Common character encodings include ASCII, ISO-8859-1, and UTF-8.

Introduction to UTF-8 encoding

UTF-8 is a variable-length Unicode character encoding that uses 1 to 4 bytes to represent a character. It is one of the encoding forms defined by the Unicode Standard and is backward compatible with ASCII.
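
The 1-to-4-byte range can be checked with utf8.RuneLen from the standard library. In this sketch the four sample characters are arbitrary picks, one for each length, and the wrapper name encodedLen is an assumption.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen returns how many bytes UTF-8 needs to encode r.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample character for each possible encoded length.
	for _, r := range []rune{'A', 'ß', '世', '😀'} {
		fmt.Printf("%c: %d byte(s)\n", r, encodedLen(r))
	}
	// The four lines report 1, 2, 3 and 4 bytes respectively.
}
```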

Example

package main

import (
  "fmt"
)

func main() {
  // A string containing ASCII and Unicode characters
  str := "Go世界"

  fmt.Println("String:", str)

  // Print Unicode code points
  fmt.Println("Rune Code Points:")
  for _, runeValue := range str {
    fmt.Printf("%c: %U\n", runeValue, runeValue)
  }

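  // Print UTF-8 encoded bytes, one per line, as byte index and hex value.
  // A single byte of a multi-byte character is not a printable character on its own.
  fmt.Println("UTF-8 Bytes:")
  for i := 0; i < len(str); i++ {
    fmt.Printf("%d: %x\n", i, str[i])
  }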
}

When you run this, you'll see that characters like G and o (ASCII) are represented by a single byte, whereas 世 and 界 are each represented by three bytes in their UTF-8 encoding.


Unicode code points and rune types

A Unicode code point is a unique number assigned to each character. In Go, Unicode code points can be stored and processed using the rune type, which is an alias for int32.

Example

package main

import (
  "fmt"
)

func main() {
  str := "语言"

  for _, char := range str {
    fmt.Printf("U+%04X ", char)
  }
}

When you run the program, it will print the Unicode code points for the characters in the string 语言.


String interoperability with UTF-8

You can use the len function to get the byte length of a string, but under UTF-8 encoding, you need to use utf8.RuneCountInString (from the unicode/utf8 package) to get the number of characters.

Example

str := "语言"
byteLen := len(str)
runeLen := utf8.RuneCountInString(str)
fmt.Println(byteLen)  // 6
fmt.Println(runeLen)  // 2

Decode string into rune slices

Use []rune to convert a string into a rune slice.

Example

str := "语言"
runes := []rune(str)
fmt.Println(runes)
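
A rune slice is mutable, which makes it handy for character-level edits. As one illustration, the following sketch reverses a string by code point. The function name reverse is hypothetical, and the approach does not account for combining characters.

```go
package main

import "fmt"

// reverse returns s with its code points in reverse order.
// It reverses runes, not bytes, so multi-byte characters stay intact.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	fmt.Println(reverse("Go语言")) // 言语oG
}
```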

Convert character encoding

Although Go primarily supports UTF-8, it may sometimes be necessary to interoperate with other character encodings, such as ISO-8859-1 or GBK. In this case, you can use the golang.org/x/text/encoding packages, which are maintained by the Go project outside the standard library.

Example

import "golang.org/x/text/encoding/simplifiedchinese"
import "golang.org/x/text/transform"

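str := "语言"
encoder := simplifiedchinese.GB18030.NewEncoder()
// transform.String returns the converted string, the number of source bytes consumed, and an error.
encoded, _, err := transform.String(encoder, str)
if err != nil {
    fmt.Println("encoding failed:", err)
} else {
    // The result is GB18030, not UTF-8, so print its bytes in hex.
    fmt.Printf("% x\n", encoded)
}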

Summary

Strings are a basic and indispensable data type in programming. This article has covered the inner workings of strings in Go, the common operations on them, their character encoding, and how to convert between representations. Together, these topics show both Go's capabilities for string manipulation and how it handles multilingual text.

From Go's design philosophy, we can see how it balances performance, security, and ease of use. The string is read-only, which makes it safe in concurrent situations. At the same time, Go uses UTF-8 as its default encoding, making global application development simple and intuitive.