Strings¶
Overview¶
A string is a sequence of characters. Strings are written between double quotation marks:
"hello"
"Scheme"
"123"
Strings are commonly used to represent textual data, such as names, messages, file contents, and user input. Like vectors, strings are indexed collections, and individual characters can be accessed by position using string-ref.
--> (define s "hello")
--> (string-ref s 1)
#\e
Internally, strings are implemented as contiguous arrays of characters at the C level. This makes indexing operations efficient and predictable. Accessing a character at a given index takes constant time, unlike lists, which require traversal.
Cozenage strings are internally stored as UTF-8 encoded byte sequences, allowing them to represent the full range of Unicode characters. This means strings can contain not only ASCII text, but accented letters, mathematical symbols, emoji, and characters from writing systems around the world. UTF-8 is a variable-length encoding, so some characters occupy more than one byte. While this detail is handled automatically by the implementation, it is important to remember that the number of bytes in a string is not necessarily the same as the number of characters. All standard string operations operate on characters rather than raw bytes, ensuring correct behavior even when multi-byte Unicode characters are present.
Example:
--> "λ"
"λ"
--> (string-length "λ")
1
Even though the character λ occupies multiple bytes in UTF-8, it is correctly treated as a single character.
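The gap between character count and byte count can be observed directly. The following is a minimal sketch, assuming the R7RS procedures string->utf8 and bytevector-length are available in Cozenage (they are not documented in this section); utf8-byte-count is a hypothetical helper name.

```scheme
;; Hypothetical helper: number of bytes in the UTF-8 encoding of s.
;; Assumes R7RS string->utf8 and bytevector-length are available.
(define (utf8-byte-count s)
  (bytevector-length (string->utf8 s)))

(string-length "λ")        ; => 1 (one character)
(utf8-byte-count "λ")      ; => 2 (U+03BB encodes as two bytes)
(string-length "日本語")    ; => 3
(utf8-byte-count "日本語")  ; => 9 (three bytes per character)
```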
Strings have a fixed length once created. Although individual characters may be modified using mutation procedures such as string-set!, the overall length of the string cannot change. To create a string of a different length, a new string must be allocated.
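As a short sketch of this rule, the following uses only procedures documented on this page: mutation leaves the length unchanged, and a longer string is obtained by allocating a new one. The variable names are illustrative.

```scheme
;; string-set! replaces a character but never changes the length.
(define s (string-copy "hello"))
(string-set! s 0 #\H)
(string-length s)        ; => 5

;; "Growing" a string means building a new one; s itself is untouched.
(define longer (string-append s ", world"))
s                        ; => "Hello"
longer                   ; => "Hello, world"
```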
Strings are useful for:
- Representing human-readable text
- Parsing and formatting data
- Storing identifiers or keys
- File and network I/O
- Interfacing with external systems
Because strings are sequences of characters, many procedures operate on them collectively, including substring extraction, comparison, and conversion to and from lists of characters.
It is important to remember that strings are distinct from symbols. A string represents textual data, while a symbol represents an identifier. Although they may look similar when printed, they serve different purposes and behave differently in the language.
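The distinction can be checked with type predicates. This sketch assumes the R7RS procedures symbol?, string?, symbol->string and string->symbol are available; they are not documented in this section.

```scheme
;; A string and a symbol with the same spelling are different objects
;; of different types.
(string? "hello")                      ; => #t
(symbol? 'hello)                       ; => #t
(string? 'hello)                       ; => #f
(equal? "hello" 'hello)                ; => #f

;; Explicit conversion is required to move between the two.
(symbol->string 'hello)                ; => "hello"
(eq? (string->symbol "hello") 'hello)  ; => #t
```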
String Procedures¶
String Constructor, Accessor, and Setter Procedures¶
string¶
- (string char ...)
Returns a newly allocated string composed of the given characters, in order. Analogous to list. If no arguments are provided, returns an empty string.
- Parameters:
char (character) – Zero or more characters to collect into a string.
- Returns:
A new string containing the given characters.
- Return type:
string
Example:
--> (string #\h #\e #\l #\l #\o)
"hello"
--> (string #\λ #\space #\λ)
"λ λ"
--> (string)
""
string-append¶
- (string-append string ...)
Returns a newly allocated string whose characters are the concatenation of the characters of all string arguments, in order. If no arguments are provided, returns an empty string. If exactly one argument is provided, a fresh copy is returned.
- Parameters:
string (string) – Zero or more strings to concatenate.
- Returns:
A new string containing all characters of all string arguments.
- Return type:
string
Example:
--> (string-append "hello" " " "world")
"hello world"
--> (string-append "café" " " "日本語")
"café 日本語"
--> (string-append "foo")
"foo"
--> (string-append)
""
string-ref¶
- (string-ref string k)
Returns the character at index k of string, using zero-based character indexing. For strings containing multi-byte UTF-8 characters, k is a character index, not a byte offset. Raises an error if k is negative or out of range.
- Parameters:
string (string) – The string to index into.
k (integer) – The zero-based character index of the character to retrieve.
- Returns:
The character at index k.
- Return type:
character
Example:
--> (string-ref "hello" 0)
#\h
--> (string-ref "hello" 4)
#\o
--> (string-ref "café" 3)
#\é
--> (string-ref "日本語" 1)
#\本
make-string¶
- (make-string k [char])
Returns a newly allocated string of k characters. If char is given, every character is initialised to char. Otherwise every character is initialised to a space (#\space, U+0020). k must be a non-negative integer.
- Parameters:
k (integer) – The number of characters in the new string.
char (character) – The character to fill the string with. Defaults to #\space.
- Returns:
A new string of k characters.
- Return type:
string
Example:
--> (make-string 5)
"     "
--> (make-string 5 #\x)
"xxxxx"
--> (make-string 3 #\λ)
"λλλ"
--> (make-string 0)
""
substring¶
- (substring string start end)
Returns a newly allocated string formed from the characters of string beginning at index start (inclusive) and ending at index end (exclusive). start and end are character indices. Raises an error if either index is out of range or if start is greater than end.
Note that substring is equivalent to calling string-copy with the same arguments, and is provided for backward compatibility and stylistic flexibility.
- Parameters:
string (string) – The string to extract a substring from.
start (integer) – The character index of the first character to include.
end (integer) – The character index past the last character to include.
- Returns:
A new string containing the characters of string between start and end.
- Return type:
string
Example:
--> (substring "hello" 1 3)
"el"
--> (substring "hello" 0 5)
"hello"
--> (substring "café" 2 4)
"fé"
--> (substring "日本語" 1 3)
"本語"
string-set!¶
- (string-set! string k char)
Stores char at character index k of string, mutating it in place. Correctly handles replacement of a character with one of a different UTF-8 byte width by reallocating and shifting the underlying buffer as needed. Raises an error if k is negative or out of range. Returns an unspecified value.
- Parameters:
string (string) – The string to mutate.
k (integer) – The zero-based character index at which to store char.
char (character) – The character to store.
- Returns:
Unspecified.
Example:
--> (define s (string-copy "hello"))
--> (string-set! s 0 #\H)
--> s
"Hello"
--> (define s2 (string-copy "café"))
--> (string-set! s2 3 #\e)
--> s2
"cafe"
--> (string-set! s2 3 #\é)
--> s2
"café"
string-copy¶
- (string-copy string [start [end]])
Returns a newly allocated copy of the characters of string between start (inclusive) and end (exclusive). start and end are character indices. start defaults to 0 and end defaults to the length of string. Raises an error if either index is out of range or if start is greater than end.
- Parameters:
string (string) – The string to copy.
start (integer) – The index of the first character to include. Defaults to 0.
end (integer) – The index past the last character to include. Defaults to the length of string.
- Returns:
A new string containing the characters of string between start and end.
- Return type:
string
Example:
--> (string-copy "hello")
"hello"
--> (string-copy "hello" 1)
"ello"
--> (string-copy "hello" 1 3)
"el"
--> (string-copy "café" 2)
"fé"
string-copy!¶
- (string-copy! to at from [start [end]])
Copies the characters of the string from between start (inclusive) and end (exclusive) into the string to, starting at character index at, mutating to in place. All indices are character indices. start defaults to 0 and end defaults to the length of from. If the source and destination ranges overlap, the copy is performed correctly. Returns an unspecified value.
Raises an error if at is out of range for to, if start or end are out of range for from, or if the target region in to is too small to hold the copied characters.
- Parameters:
to (string) – The destination string to copy characters into.
at (integer) – The character index in to at which to begin writing.
from (string) – The source string to copy characters from.
start (integer) – The index of the first character in from to copy. Defaults to 0.
end (integer) – The index past the last character in from to copy. Defaults to the length of from.
- Returns:
Unspecified.
Example:
--> (define s (string-copy "hello world!"))
--> (string-copy! s 6 "scheme")
--> s
"hello scheme"
--> (define s2 (string-copy "café bar"))
--> (string-copy! s2 5 "日本語" 0 2)
--> s2
"café 日本r"
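The overlap guarantee described above can be illustrated by using the same string as both source and destination. This is a minimal sketch using only procedures documented on this page.

```scheme
;; Copy the first four characters two positions to the right,
;; with the same string as both source and destination.
(define s (string-copy "abcdef"))
(string-copy! s 2 s 0 4)
s   ; => "ababcd"
```

A naive left-to-right copy would overwrite characters before reading them; the documented behaviour is that the result matches what a copy through temporary storage would produce.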
string-fill!¶
- (string-fill! string fill [start [end]])
Stores the character fill in every position of string between start (inclusive) and end (exclusive), mutating string in place. All indices are character indices. start defaults to 0 and end defaults to the length of string. Correctly handles replacement with a character of a different UTF-8 byte width by rebuilding the underlying buffer as needed. Returns an unspecified value.
- Parameters:
string (string) – The string to fill.
fill (character) – The character to store in each position.
start (integer) – The index of the first character to fill. Defaults to 0.
end (integer) – The index past the last character to fill. Defaults to the length of string.
- Returns:
Unspecified.
Example:
--> (define s (string-copy "hello"))
--> (string-fill! s #\*)
--> s
"*****"
--> (define s2 (string-copy "hello"))
--> (string-fill! s2 #\x 1 3)
--> s2
"hxxlo"
--> (define s3 (string-copy "hello"))
--> (string-fill! s3 #\λ 0 2)
--> s3
"λλllo"
String Misc Procedures¶
string-length¶
- (string-length string)
Returns the number of characters in string as an exact integer. Note that for strings containing multi-byte UTF-8 characters, this is the number of Unicode code points, not the number of underlying bytes.
- Parameters:
string (string) – The string to measure.
- Returns:
The number of characters in string.
- Return type:
integer
Example:
--> (string-length "hello")
5
--> (string-length "")
0
--> (string-length "café")
4
--> (string-length "日本語")
3
string->list¶
- (string->list string [start [end]])
Returns a newly allocated list of the characters of string between start (inclusive) and end (exclusive). start and end are character indices. start defaults to 0 and end defaults to the length of string. Raises an error if either index is out of range, or if start is greater than end.
- Parameters:
string (string) – The string to convert.
start (integer) – The index of the first character to include. Defaults to 0.
end (integer) – The index past the last character to include. Defaults to the length of string.
- Returns:
A new list of the characters of string between start and end.
- Return type:
list
Example:
--> (string->list "hello")
(#\h #\e #\l #\l #\o)
--> (string->list "hello" 1)
(#\e #\l #\l #\o)
--> (string->list "hello" 1 3)
(#\e #\l)
--> (string->list "café" 2)
(#\f #\é)
--> (string->list "")
()
list->string¶
- (list->string list)
Returns a newly allocated string formed from the characters in list, in order. Every element of list must be a character; an error is raised otherwise.
- Parameters:
list (list) – A proper list of characters.
- Returns:
A new string containing the characters of list.
- Return type:
string
Example:
--> (list->string '(#\h #\e #\l #\l #\o))
"hello"
--> (list->string '(#\c #\a #\f #\é))
"café"
--> (list->string '())
""
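Because string->list and list->string convert in both directions, list procedures can be applied to the characters of a string. The following sketch reverses a string this way; my-string-reverse is a hypothetical helper name, and the standard list procedure reverse is assumed to be available.

```scheme
;; Hypothetical helper: reverse a string by round-tripping through a list.
;; Works per character, so multi-byte characters stay intact.
(define (my-string-reverse s)
  (list->string (reverse (string->list s))))

(my-string-reverse "hello")   ; => "olleh"
(my-string-reverse "日本語")   ; => "語本日"
(my-string-reverse "")        ; => ""
```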
string->number¶
- (string->number string [radix])
Returns the number represented by string, or #f if string is not a valid numeric representation. radix must be one of 2, 8, 10, or 16, and defaults to 10. An explicit radix prefix in string (e.g. "#o177") overrides the radix argument. Unlike most error conditions, an invalid numeric string never signals an error; #f is returned instead.
The result is the most precise representation available; integers, rationals, reals, and complex numbers are all returned in their native types.
- Parameters:
string (string) – A string to parse as a number.
radix (integer) – The base in which to interpret string. Must be 2, 8, 10, or 16. Defaults to 10.
- Returns:
The number represented by string, or #f if string is not a valid numeric representation.
- Return type:
number or boolean
Example:
--> (string->number "42")
42
--> (string->number "3.14")
3.14
--> (string->number "1/3")
1/3
--> (string->number "FF" 16)
255
--> (string->number "1010" 2)
10
--> (string->number "#o177")
127
--> (string->number "1+2i")
1+2i
--> (string->number "hello")
#f
--> (string->number "")
#f
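Since invalid input yields #f rather than an error, string->number combines naturally with or to supply a fallback value. A minimal sketch follows; parse-number-or is a hypothetical helper name.

```scheme
;; Hypothetical helper: parse s as a number, or return default
;; when s is not a valid numeric representation.
(define (parse-number-or s default)
  (or (string->number s) default))

(parse-number-or "42" 0)      ; => 42
(parse-number-or "hello" 0)   ; => 0
(parse-number-or "0" -1)      ; => 0, because 0 counts as true in Scheme
```

Only #f is false in Scheme, so a successfully parsed zero is not mistaken for a failure.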
number->string¶
- (number->string z [radix])
Returns a string representation of z in the given radix. radix must be one of 2, 8, 10, or 16, and defaults to 10. The result never contains an explicit radix prefix. For inexact numbers with radix 10, the result contains a decimal point and uses the minimum number of digits necessary to uniquely identify the value.
number->string and string->number are inverses: for any number z and valid radix r, (eqv? z (string->number (number->string z r) r)) is guaranteed to be #t.
- Parameters:
z (number) – The number to convert.
radix (integer) – The base in which to represent z. Must be 2, 8, 10, or 16. Defaults to 10.
- Returns:
A string representation of z in the given radix.
- Return type:
string
Example:
--> (number->string 42)
"42"
--> (number->string 255 16)
"ff"
--> (number->string 10 2)
"1010"
--> (number->string 127 8)
"177"
--> (number->string 3.14)
"3.14"
--> (number->string 1/3)
"1/3"
--> (number->string 1+2i)
"1+2i"
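The inverse relationship stated above can be wrapped in a small predicate. This is only a sketch of the documented guarantee; round-trips? is a hypothetical helper name.

```scheme
;; Hypothetical helper: does z survive conversion to a string and back
;; in the given radix?
(define (round-trips? z radix)
  (eqv? z (string->number (number->string z radix) radix)))

(round-trips? 255 16)    ; => #t  (255 -> "ff" -> 255)
(round-trips? 1/3 10)    ; => #t  (1/3 -> "1/3" -> 1/3)
(round-trips? 3.14 10)   ; => #t  (3.14 -> "3.14" -> 3.14)
```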
string-split¶
- (string-split string [delim])
Returns a list of substrings of string obtained by splitting on occurrences of delim. delim is a string rather than a character, allowing for multi-character delimiters. Empty substrings produced by adjacent delimiters or leading/trailing delimiters are suppressed. If delim is omitted, string is split on spaces.
Raises an error if delim is longer than string, which most likely indicates reversed argument order.
- Parameters:
string (string) – The string to split.
delim (string) – The delimiter string to split on. Defaults to " ".
- Returns:
A list of substrings of string split by delim.
- Return type:
list
Example:
--> (string-split "hello world foo")
("hello" "world" "foo")
--> (string-split "a,b,c" ",")
("a" "b" "c")
--> (string-split "one::two::three" "::")
("one" "two" "three")
--> (string-split " hello world ")
("hello" "world")
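string-split pairs well with string->number when parsing delimited numeric input. The sketch below relies on the behaviour documented on this page for both procedures and assumes the standard map procedure is available; parse-number-list is a hypothetical helper name.

```scheme
;; Hypothetical helper: split a comma-separated string and parse each
;; field as a number. Fields that are not numeric become #f.
(define (parse-number-list s)
  (map string->number (string-split s ",")))

(parse-number-list "1,2,3")     ; => (1 2 3)
(parse-number-list "10,x,30")   ; => (10 #f 30)
```

Because empty substrings are suppressed, an input such as "1,,2" yields two fields rather than three, so this approach is unsuitable when empty fields carry meaning.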
String Case-sensitive Comparison Procedures¶
string=?¶
- (string=? string1 string2 ...)
Returns #t if all arguments contain the same sequence of characters, #f otherwise. Comparison is case-sensitive and based on Unicode scalar values. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if all arguments are equal, #f otherwise.
- Return type:
boolean
Example:
--> (string=? "hello" "hello")
#t
--> (string=? "hello" "Hello")
#f
--> (string=? "café" "café")
#t
--> (string=? "a" "a" "a")
#t
string<?¶
- (string<? string1 string2 ...)
Returns #t if the arguments are monotonically increasing in lexicographic order by Unicode scalar value, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically increasing, #f otherwise.
- Return type:
boolean
Example:
--> (string<? "abc" "abd")
#t
--> (string<? "abc" "abc")
#f
--> (string<? "a" "b" "c")
#t
string<=?¶
- (string<=? string1 string2 ...)
Returns #t if the arguments are monotonically non-decreasing in lexicographic order by Unicode scalar value, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically non-decreasing, #f otherwise.
- Return type:
boolean
Example:
--> (string<=? "abc" "abd")
#t
--> (string<=? "abc" "abc")
#t
--> (string<=? "abd" "abc")
#f
--> (string<=? "a" "a" "b")
#t
string>?¶
- (string>? string1 string2 ...)
Returns #t if the arguments are monotonically decreasing in lexicographic order by Unicode scalar value, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically decreasing, #f otherwise.
- Return type:
boolean
Example:
--> (string>? "abd" "abc")
#t
--> (string>? "abc" "abc")
#f
--> (string>? "c" "b" "a")
#t
string>=?¶
- (string>=? string1 string2 ...)
Returns #t if the arguments are monotonically non-increasing in lexicographic order by Unicode scalar value, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically non-increasing, #f otherwise.
- Return type:
boolean
Example:
--> (string>=? "abd" "abc")
#t
--> (string>=? "abc" "abc")
#t
--> (string>=? "abc" "abd")
#f
--> (string>=? "c" "b" "b")
#t
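Because these predicates accept any number of arguments, a whole list of strings can be checked for ordering in one call. A minimal sketch, assuming the standard apply procedure is available; strings-sorted? is a hypothetical helper name.

```scheme
;; Hypothetical helper: is the list of strings in non-decreasing order
;; by Unicode scalar value?
(define (strings-sorted? strings)
  (apply string<=? strings))

(strings-sorted? '("apple" "banana" "cherry"))   ; => #t
(strings-sorted? '("apple" "cherry" "banana"))   ; => #f
(strings-sorted? '("Zebra" "apple"))             ; => #t
```

The last result follows from the comparison being case-sensitive: #\Z (U+005A) has a smaller scalar value than #\a (U+0061), so every uppercase ASCII letter sorts before every lowercase one.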
String Case-insensitive Comparison Procedures¶
string-ci=?¶
- (string-ci=? string1 string2 ...)
Returns #t if all arguments are equal under Unicode case-folding, #f otherwise. Equivalent to applying string-foldcase to all arguments before comparing with string=?. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if all arguments are equal ignoring case, #f otherwise.
- Return type:
boolean
Example:
--> (string-ci=? "hello" "HELLO")
#t
--> (string-ci=? "café" "CAFÉ")
#t
--> (string-ci=? "hello" "world")
#f
--> (string-ci=? "a" "A" "a")
#t
string-ci<?¶
- (string-ci<? string1 string2 ...)
Returns #t if the arguments are monotonically increasing in case-folded lexicographic order, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically increasing ignoring case, #f otherwise.
- Return type:
boolean
Example:
--> (string-ci<? "abc" "ABD")
#t
--> (string-ci<? "ABC" "abd")
#t
--> (string-ci<? "abc" "ABC")
#f
string-ci<=?¶
- (string-ci<=? string1 string2 ...)
Returns #t if the arguments are monotonically non-decreasing in case-folded lexicographic order, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically non-decreasing ignoring case, #f otherwise.
- Return type:
boolean
Example:
--> (string-ci<=? "abc" "ABD")
#t
--> (string-ci<=? "abc" "ABC")
#t
--> (string-ci<=? "ABD" "abc")
#f
string-ci>?¶
- (string-ci>? string1 string2 ...)
Returns #t if the arguments are monotonically decreasing in case-folded lexicographic order, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically decreasing ignoring case, #f otherwise.
- Return type:
boolean
Example:
--> (string-ci>? "ABD" "abc")
#t
--> (string-ci>? "ABC" "abc")
#f
--> (string-ci>? "C" "b" "A")
#t
string-ci>=?¶
- (string-ci>=? string1 string2 ...)
Returns #t if the arguments are monotonically non-increasing in case-folded lexicographic order, #f otherwise. Zero or one argument returns #t.
- Parameters:
string1 (string) – Two or more strings to compare.
- Returns:
#t if the arguments are monotonically non-increasing ignoring case, #f otherwise.
- Return type:
boolean
Example:
--> (string-ci>=? "ABD" "abc")
#t
--> (string-ci>=? "ABC" "abc")
#t
--> (string-ci>=? "abc" "ABD")
#f
--> (string-ci>=? "C" "b" "B")
#t
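The case-insensitive predicates are described in terms of case folding. The sketch below spells out that equivalence for the two-argument equality case using string-foldcase and string=?; ci-equal? is a hypothetical helper name, not part of the library.

```scheme
;; Hypothetical helper: two-argument case-insensitive equality written
;; out as "fold both sides, then compare case-sensitively".
(define (ci-equal? a b)
  (string=? (string-foldcase a) (string-foldcase b)))

(ci-equal? "hello" "HELLO")     ; => #t
(string-ci=? "hello" "HELLO")   ; => #t, same answer
(ci-equal? "hello" "world")     ; => #f
```

When the same string is compared many times, folding it once and reusing the folded copy with string=? avoids repeating the folding work on every comparison.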
String Case-transformation Procedures¶
string-upcase¶
- (string-upcase string)
Returns a new string with all alphabetic characters in string converted to uppercase.
- Parameters:
string (string) – The string to convert.
- Returns:
A new, uppercased string.
- Return type:
string
Example:
--> (string-upcase "Hello World")
"HELLO WORLD"
string-downcase¶
- (string-downcase string)
Returns a new string with all alphabetic characters in string converted to lowercase.
- Parameters:
string (string) – The string to convert.
- Returns:
A new, downcased string.
- Return type:
string
Example:
--> (string-downcase "Hello World")
"hello world"
string-foldcase¶
- (string-foldcase string)
Returns a new string with all characters in string converted to their folded-case equivalents. This is the preferred method for preparing strings for case-insensitive comparison.
- Parameters:
string (string) – The string to convert.
- Returns:
A new, case-folded string.
- Return type:
string
Example:
--> (string-foldcase "Der Fluß")
"der fluss"
String Iteration Procedures¶
string-map¶
- (string-map proc string ...)
Applies proc element-wise to the characters of each string argument and returns a new string of the resulting characters, in order. proc must accept as many arguments as there are strings and must return a character; an error is raised if it returns any other type. If more than one string is given and they differ in length, iteration stops when the shortest string is exhausted.
- Parameters:
proc (procedure) – A procedure accepting as many character arguments as there are strings, and returning a character.
string (string) – One or more strings to map over.
- Returns:
A new string of the characters returned by proc.
- Return type:
string
Example:
--> (string-map char-upcase "hello")
"HELLO"
--> (string-map char-downcase "CAFÉ")
"café"
--> (string-map (lambda (c) (if (char=? c #\a) #\@ c)) "banana")
"b@n@n@"
--> (string-map (lambda (a b) (if (char<? a b) a b)) "hello" "world")
"helld"
string-for-each¶
- (string-for-each proc string ...)
Applies proc element-wise to the characters of each string argument, in order, for its side effects. Unlike string-map, the results of proc are discarded. proc must accept as many arguments as there are strings. If more than one string is given and they differ in length, iteration stops when the shortest string is exhausted. Returns an unspecified value.
- Parameters:
proc (procedure) – A procedure accepting as many character arguments as there are strings.
string (string) – One or more strings to iterate over.
- Returns:
Unspecified.
Example:
--> (string-for-each display "hello")
hello
--> (define result '())
--> (string-for-each
...   (lambda (c) (set! result (cons c result)))
...   "hello")
--> result
(#\o #\l #\l #\e #\h)
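string-for-each is a natural fit for accumulating a result through side effects. The following sketch counts occurrences of a character; count-char is a hypothetical helper name.

```scheme
;; Hypothetical helper: count how many times ch occurs in s.
;; The counter n is updated as a side effect of each call to the lambda.
(define (count-char ch s)
  (let ((n 0))
    (string-for-each
      (lambda (c)
        (if (char=? c ch)
            (set! n (+ n 1))))
      s)
    n))

(count-char #\a "banana")   ; => 3
(count-char #\z "banana")   ; => 0
(count-char #\λ "λ λ")      ; => 2
```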