Data types
Q is designed for speed with very large data sets. Loose datatyping would compromise efficiency, because primitives would have to cast silently from one type to another. Fine-grained datatypes minimise automatic type conversion.
Atoms
An atom is the smallest unit of data. An atom cannot be indexed.
There are 18 types of data atoms.
boolean 0b
guid 52cb20d9-f12c-9963-2829-3c64d8d8cb14
byte 0x00
short 0h
int 0i
long 0j, 0
real 0e
float 0.0, 0f
char " "
symbol `
timestamp 2023.03.16D11:51:26.923000000
month 2000.01m
date 2000.01.01
datetime 2023.03.16T11:52:07.320
timespan 00:00:00.000000000
minute 00:00
second 00:00:00
time 00:00:00.000
The type keyword returns the type of its argument as a short.
A negative sign indicates an atom.
q)type each (3;3.14159;"q";`q;2023.03m)
-7 -9 -10 -11 -13h
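As a small illustrative sketch (the values here are arbitrary), the sign of the result distinguishes an atom from a vector of the same datatype, and a list of mixed types has type 0h.

```q
q)type 3            / long atom
-7h
q)type 3 4 5        / long vector
7h
q)type (3;"q";`q)   / general (mixed) list
0h
```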
True and False
Booleans 10b are True and False, but all datatypes are truthy.
Any zero value is False; others are True.
q)"cfjmpu"$/:0 1
"\000" 0f 0 2000.01m 2000.01.01D00:00:00.000000000 00:00
"\001" 1f 1 2000.02m 2000.01.01D00:00:00.000000001 00:01
q)not "cfjmpu"$/:0 1
111111b
000000b
Nulls
Every datatype except boolean and byte has its own null value.
Taking the first item of an empty vector returns the null for that datatype.
q)show now:.z.d+.z.t
2023.10.09D21:15:57.508000000
q)first each 0#'(1;now;`abc;"abc")
0N
0Np
`
" "
Tok returns a null from an empty string – except for booleans, of course.
q)"J"$""
0N
q)1 null\"B"$""
00b
q)"G"$""
00000000-0000-0000-0000-000000000000
The null keyword indicates null values.
q)null first each 0#'(1;now;`abc;"abc";1b)
11110b
In noun syntax the Identity operator :: denotes the general null.
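Nulls can also be written directly. The sketch below uses the literal forms 0N, 0Ni, 0Nh, 0n, 0Nd and 0Np, which are not introduced in the text above, to show that each null carries its own datatype, and that the general null has a type of its own.

```q
q)type each (0N;0Ni;0Nh;0n;0Nd;0Np)   / long, int, short, float, date, timestamp
-7 -6 -5 -9 -14 -12h
q)null 0N 5 0N                        / null is atomic
101b
q)type (::)                           / the general null
101h
```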
Infinities
The min and max of an empty vector return the positive and negative infinities of its datatype.
q)(min;max)@\:0#99h / short
0W -0Wh
q)(min;max)@\:0#99i / int
0W -0Wi
q)(min;max)@\:0#99. / float
0w -0w
q)(min;max)@\:0#now / timestamp
0W -0Wp
Casting infinities
An infinity becomes a finite number when cast to a broader datatype.
This can surprise you if, for example, an infinity is promoted for insertion into a vector of broader type.
q)"j"$(min;max)@\:0#99h
32767 -32767
q)@[til 10;3 7;:;] "j"$99 0Wh
0 1 2 99 4 5 6 32767 8 9
Strings and symbols
There is no String datatype.
What qbists call a string is a vector of characters. (See Data Structures.)
The closest thing in q to an immutable string is a symbol.
A symbol is written as a backtick followed by zero or more characters.
It is used to enumerate repeated strings such as stock codes.
`goog / Google
`ibm / IBM
`msft / Microsoft
A symbol atom displays as text, but it is an enumeration: the underlying data is an integer index into a master list of strings called the symlist.
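A short sketch of the difference, reusing the stock code from above: the string is a vector that can be counted and indexed, while the symbol is a single atom.

```q
q)type "goog"        / a string is a char vector
10h
q)count "goog"
4
q)"goog"[0]          / and can be indexed
"g"
q)type `goog         / a symbol is an atom
-11h
q)count `goog
1
q)`$"goog"           / cast a string to a symbol
`goog
```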
Literal and display forms
The literal form of a data type is how you write its atoms. Its display form is how the interpreter writes its atoms.
As far as is practical, literal and display forms are identical.
Not always:
- The j suffix is optional in the literal form of longs and omitted in the display form.
- A decimal point in a float makes an f suffix optional in the literal and omitted in the display.
- GUIDs have no literal form.
- Display forms of primitives defined in k, such as til, are their k definitions.
Type suffixes are omitted in the displays of arrays.
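For longs and floats, the differences between literal and display forms can be seen by entering a few atoms at the prompt. The values below are arbitrary illustrations.

```q
q)42j      / j suffix accepted in the literal, omitted in the display
42
q)3.0      / a whole-number float displays with an f suffix
3f
q)1.5f     / f suffix accepted in the literal, omitted in the display
1.5
```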
Casting between datatypes
The Cast operator $ casts data between types.
Its left argument is the target datatype, as a short, char, or symbol from the table of datatypes.
q)9h$999
999f
q)"j"$3.14159
3
q)`byte$"q"
0x71
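The three forms of the left argument are interchangeable. As a sketch, the first line below casts the same long to float using a short, a char and a symbol. The second line is an added illustration of casting between temporal types.

```q
q)(9h$999;"f"$999;`float$999)            / short, char and symbol forms
999 999 999f
q)`date$2023.03.16D11:51:26.923000000    / timestamp to date
2023.03.16
```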
The string keyword returns a string representation of an atom.
q)"c"$3.14159 / cast to char
"\003"
q)string 3.14159 / string representation
"3.14159"
q)string `ibm
"ibm"
q)string 2000.01.01
"2000.01.01"
Temporal data
All temporal datatypes have underlying numeric representations.
Dot notation is a convenience for temporal data.
q)show now:.z.d+.z.t
2023.10.09D18:31:07.007000000
q)(now.minute;now.second;now.month)
18:31
18:31:07
2023.10m
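Casting to long exposes the numeric representation underlying a temporal value. The lines below are a sketch for two of the datatypes only. The values are chosen for illustration.

```q
q)"j"$2000.01.01 2000.01.02    / dates count days from 2000.01.01
0 1
q)"j"$00:00:01.000             / times count milliseconds from midnight
1000
q)2000.01.02-2000.01.01        / so temporal arithmetic is integer arithmetic
1i
```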
Functions are atoms but they are not data atoms
Functions are first-class objects in q, so they too are atoms. But they are not data atoms.
Functions include operators, keywords, projections, compositions, and lambdas. Each has its own datatype.
q)type each (+;2*;type;+/;{x*y+z})
102 104 101 107 100h
q)/operator;projection;keyword;derived function;lambda
Exercises
No peeking.
Attempt each exercise before looking at its answer.
The exercises clarify your thinking. Reading answers does not.
- Review the Datatypes reference. Investigate anything unfamiliar.
- List everything you learned from the previous exercise.
- Distinguish between Tok, Cast and string: write a definition of each.
Answer
Cast is atomic: it casts each atom of y to type x.
Tok is string-atomic: it interprets each string in y as an atom of type x. (x is an upper-case letter.)
string is atomic: it represents each atom of its argument as a string.
- What is the relationship between Tok and string? Apply string to a string. What does it return, and why does it not return its argument?
Answer
Tok and string are inverses of each other. Tok interprets strings as data atoms; string represents data atoms as strings.
q)string "string"
,"s"
,"t"
,"r"
,"i"
,"n"
,"g"
Above, string represents each atom of "string" as a string, returning six 1-character strings.
Casting a string to char is a no-op.
q)"c"$"string"
"string"
- Show that Cast is left-atomic.
Answer
q)(type'')("h";("ij");"f")$4
-5h
-6 -7h
-9h
Above, the left argument of Cast is a nested list. Obeying scalar extension, Cast iterates it over its atom right argument.
- What datatype does not have a literal form? How could you write one of its atoms?
- List all the temporal datatypes with their units.