How to iterate over multiline strings in Go
In this article, we will discuss the best approach to iterating over each line of a newline-delimited string (such as the contents of a text file) in Go.
If you need to iterate over each line of a string in Go (such as each line of a file), you can use the bufio.Scanner type, which provides a convenient interface for reading data such as lines of text from any source. You create a Scanner with the bufio.NewScanner() function, which takes any value that implements the io.Reader interface as its only argument.
If you want to read a file line by line, you can call os.Open() with the file name and pass the resulting *os.File to NewScanner(), since it implements io.Reader. You can also wrap an ordinary string in an io.Reader by passing it to the strings.NewReader() function.
Once you have a Scanner, you can call its Scan() method to advance to the next line, which can then be accessed through the Text() method. Scan() returns a boolean each time it is called: true if a line was read, and false when there are no more lines or an error occurred.
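The steps above can be sketched for an in-memory string as follows. This is a minimal illustration rather than code from the original example: the splitLines helper and the sample text are hypothetical names chosen for the sketch, while bufio.NewScanner, strings.NewReader, Scan(), Text() and Err() are the standard library calls described above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// splitLines is a hypothetical helper for this sketch. It wraps s in a
// strings.Reader, scans it line by line, and collects each line.
func splitLines(s string) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(strings.NewReader(s))
	for scanner.Scan() {
		// Text returns the current line without its line ending.
		lines = append(lines, scanner.Text())
	}
	// Err is nil if scanning stopped only because the input ended.
	return lines, scanner.Err()
}

func main() {
	text := "first line\nsecond line\nthird line"

	lines, err := splitLines(text)
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for i, line := range lines {
		fmt.Printf("Line %02d -> %s\n", i+1, line)
	}
}
```

Because NewScanner() only needs an io.Reader, the same loop works unchanged whether the reader comes from strings.NewReader() or from os.Open().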
Here’s an example that reads an HTML file line by line in Go. Here’s the file in question:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="X-UA-Compatible" content="ie=edge" />
<title>Document</title>
</head>
<body>
<script></script>
</body>
</html>
And here’s the code to iterate over each line in the file:
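package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	// Open the file; *os.File implements io.Reader.
	file, err := os.Open("index.html")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	counter := 1
	// Scan advances to the next line and returns false at the end of
	// the input or when an error occurs.
	for scanner.Scan() {
		line := scanner.Text()
		fmt.Printf("Line %02d -> %s\n", counter, line)
		counter++
	}

	// Err returns the first non-EOF error encountered while scanning.
	if err := scanner.Err(); err != nil {
		log.Println(err)
	}
}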
Running the program against the file above prints the following:
Line 01 -> <!DOCTYPE html>
Line 02 -> <html lang="en">
Line 03 -> <head>
Line 04 -> <meta charset="UTF-8" />
Line 05 -> <meta name="viewport" content="width=device-width, initial-scale=1.0" />
Line 06 -> <meta http-equiv="X-UA-Compatible" content="ie=edge" />
Line 07 -> <title>Document</title>
Line 08 -> </head>
Line 09 -> <body>
Line 10 -> <script></script>
Line 11 -> </body>
Line 12 -> </html>
The for loop above continues until the Scan() method returns false. Each time Scan() returns true, the line it just read is retrieved with scanner.Text() and printed to the standard output. After the loop ends, handle errors by checking whether scanner.Err() is not equal to nil: it returns the first non-EOF error that was encountered by the Scanner.
Thanks for reading, and happy coding!