PostgreSQL 9.3.4 文档 | ||||
---|---|---|---|---|
Prev | Up | Chapter 35. 扩展 SQL | Next |
SQL 函数执行一个由任意 SQL 语句构成的列表,返回列表中最后一个查询的结果。在简单(非集合)的情况中,最后一个查询的结果的第一行将被返回(记住一个多行结果的"第一行"不是良定义的,除非你使用ORDER BY)。如果最后一个查询正好根本不返回行,将会返回空值。
或者,一个 SQL 函数可以通过指定函数的返回类型为SETOF sometype被声明为返回一个集合(也就是多个行),或者等效地声明它为RETURNS TABLE(columns)。在这种情况下,最后一个查询的结果的所有行会被返回。下文将给出进一步的细节。
一个 SQL 函数的主体必须是一个由分号分隔的 SQL 语句的列表。最后一个语句之后的分号是可选的。除非函数被声明为返回void,最后一个语句必须是一个SELECT或者一个带有RETURNING子句的INSERT、UPDATE或者DELETE。
SQL语言中的任何命令集合都能被打包在一起并且被定义成一个函数。除了SELECT查询,命令可以包括数据修改查询(INSERT、UPDATE以及DELETE)和其他 SQL 命令(你不能在SQL函数中使用事务控制命令,例如COMMIT、SAVEPOINT,以及一些工具命令,例如VACUUM)。不过,最后一个命令必须是一个SELECT或者带有一个RETURNING子句,该命令必须返回符合函数返回类型的数据。或者,如果你想要定义一个执行动作但是不返回有用的值的函数,你可以把它定义为返回void。例如,这个函数从emp表中移除具有负值薪水的行:
CREATE FUNCTION clean_emp() RETURNS void AS ' DELETE FROM emp WHERE salary < 0; ' LANGUAGE SQL; SELECT clean_emp(); clean_emp ----------- (1 row)
CREATE FUNCTION命令的语法要求函数体被写作一个字符串常量。使用用于字符串常量的美元引用通常最方便(见Section 4.1.2.4)。你过你选择使用常规的单引号引用的字符串常量语法,你必须在函数体中双写单引号(')和反斜线(\)(假定转义字符串语法)(见Section 4.1.2.1)。
一个 SQL 函数的参数可以在函数体中用名称或编号引用。下面会有两种方法的例子。
要使用一个名称,将函数参数声明为带有一个名称,然后在函数体中只写该名称。如果参数名称与函数内当前 SQL 命令中的任意列名相同,列名将优先。如果不想这样,可以用函数本身的名称来限定参数名,也就是function_name.argument_name(如果这会与一个被限定的列名冲突,照例还是列名赢得优先。你可以通过为 SQL 命令中的表选择一个不同的别名来避免这种混淆)。
在更旧的数字方法中,参数可以用语法$n引用:$1指的是第一个输入参数,$2指的是第二个,以此类推。不管特定的参数是否使用名称声明,这种方法都有效。
如果一个参数是一种组合类型,那么点号记法(如argname.fieldname或$1.fieldname)也可以被用来访问该参数的属性。同样,你可能需要用函数的名称来限定参数的名称以避免歧义。
SQL 函数参数只能被用做数据值而不能作为标识符。例如这是合理的:
INSERT INTO mytable VALUES ($1);
但这样就不行:
INSERT INTO $1 VALUES (42);
Note: 使用名称来引用 SQL 函数参数的能力是在PostgreSQL 9.2 中加入的。要在老的服务器中使用的函数必须使用$n记法。
最简单的SQL函数没有参数并且简单地返回一个基本类型,例如integer:
CREATE FUNCTION one() RETURNS integer AS $$ SELECT 1 AS result; $$ LANGUAGE SQL; -- Alternative syntax for string literal: CREATE FUNCTION one() RETURNS integer AS ' SELECT 1 AS result; ' LANGUAGE SQL; SELECT one(); one ----- 1
注意我们为该函数的结果在函数体内定义了一个列别名(名为result),但是这个列别名在函数以外是不可见的。因此,结果被标记为one而不是result。
定义用基本类型作为参数的SQL函数也很容易:
CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$ SELECT x + y; $$ LANGUAGE SQL; SELECT add_em(1, 2) AS answer; answer -------- 3
我们也能省掉参数的名称而使用数字:
CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$ SELECT $1 + $2; $$ LANGUAGE SQL; SELECT add_em(1, 2) AS answer; answer -------- 3
这里是一个更有用的函数,它可以被用来借记一个银行账号:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$ UPDATE bank SET balance = balance - debit WHERE accountno = tf1.accountno; SELECT 1; $$ LANGUAGE SQL;
一个用户可以这样执行这个函数来从账户 17 中借记 $100.00:
SELECT tf1(17, 100.0);
在这个例子中,我们为第一个参数选择了名称accountno,但是这和表bank中的一个列名相同。 在UPDATE命令中, accountno引用列bank.accountno,因此 tf1.accountno必须被用来引用该参数。 我们当然可以通过为该参数使用一个不同的名称来避免这样的问题。
实际上我们可能喜欢从该函数得到一个更有用的结果而不是一个常数 1,因此一个更可能的定义是:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$ UPDATE bank SET balance = balance - debit WHERE accountno = tf1.accountno; SELECT balance FROM bank WHERE accountno = tf1.accountno; $$ LANGUAGE SQL;
它会调整余额并且返回新的余额。 同样的事情也可以用一个使用RETURNING的命令实现:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$ UPDATE bank SET balance = balance - debit WHERE accountno = tf1.accountno RETURNING balance; $$ LANGUAGE SQL;
在编写使用组合类型参数的函数时,我们必须不仅指定我们想要哪些参数,还要指定参数的期望属性(域)。例如,假定
emp是一个包含雇员数据的表,并且因此它也是该表每一行的组合类型的名称。
这里是一个函数double_salary
,它计算某个人的双倍薪水:
CREATE TABLE emp ( name text, salary numeric, age integer, cubicle point ); INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)'); CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$ SELECT $1.salary * 2 AS salary; $$ LANGUAGE SQL; SELECT name, double_salary(emp.*) AS dream FROM emp WHERE emp.cubicle ~= point '(2,1)'; name | dream ------+------- Bill | 8400
注意语法$1.salary的使用是要选择参数行值的一个域。 还要注意调用的SELECT命令是如何使用* 来选择一个表的整个当前行作为一个组合值的。该表行也可以只用表名来引用:
SELECT name, double_salary(emp) AS dream FROM emp WHERE emp.cubicle ~= point '(2,1)';
但这种用法已被废弃因为它很容易被搞混。
有时候实时构建一个组合参数很方便。这可以用ROW结构完成。 例如,我们可以调整被传递给函数的数据:
SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream FROM emp;
也可以构建一个返回组合类型的函数。这是一个返回单一emp行的函数例子:
CREATE FUNCTION new_emp() RETURNS emp AS $$ SELECT text 'None' AS name, 1000.0 AS salary, 25 AS age, point '(2,2)' AS cubicle; $$ LANGUAGE SQL;
在这个例子中,我们为每一个属性指定了一个常量值,但是可以用任何计算来替换这些常量。
有关定义函数有两件重要的事情:
查询中的选择列表顺序必须和列在与组合类型相关的表中出现的顺序完全相同(如我们上面所作的,列的命名与系统无关)。
你必须把表达式类型转换以匹配组合类型的定义,否则你将得到这样的错误:
ERROR: function declared to return emp returns varchar instead of text at column 1
定义同样的函数的一种不同的方法是:
CREATE FUNCTION new_emp() RETURNS emp AS $$ SELECT ROW('None', 1000.0, 25, '(2,2)')::emp; $$ LANGUAGE SQL;
这里我们写了一个只返回正确组合类型的单一列的SELECT。 在这种情况下这种写法实际并非更好,但是它在一些情况下比较方便 — 例如,我们需要通过调用另一个返回所期望的组合值的函数来计算结果。
我们可以在两种方式下直接调用这个函数:
SELECT new_emp(); new_emp -------------------------- (None,1000.0,25,"(2,2)") SELECT * FROM new_emp(); name | salary | age | cubicle ------+--------+-----+--------- None | 1000.0 | 25 | (2,2)
第二种方式在Section 35.4.7中有更完全的描述。
当你使用一个返回组合类型的函数时,你可能只想要其结果中的一个域(属性)。 你可以这样做:
SELECT (new_emp()).name; name ------ None
额外的圆括号是必须的,它用于避免解析器被搞混。如果你不写这些括号,会这样:
SELECT new_emp().name; ERROR: syntax error at or near "." LINE 1: SELECT new_emp().name; ^
另一个选项是使用函数记号来抽取一个属性。简而言之就是我们可以交换使用 attribute(table)和table.attribute。
SELECT name(new_emp()); name ------ None
-- 这等同于: -- SELECT emp.name AS youngster FROM emp WHERE emp.age < 30; SELECT name(emp) AS youngster FROM emp WHERE age(emp) < 30; youngster ----------- Sam Andy
Tip: 函数记号和属性记号之间的等效性使得可以在组合类型上使用函数来模拟"计算域"。 例如,使用double_salary(emp)之前的定义,我们可以写
SELECT emp.name, emp.double_salary FROM emp;使用上述语句的应用不会需要直接知道double_salary不是表的一个真实列 (也能在视图内模拟计算列)。
由于这种行为,给出一个接受单一组合类型参数并且参数与组合类型中任一域同名的函数是不明智的。
另一种使用返回组合类型的函数的方法是把结果传递给另一个接收正确行类型作为输入的函数:
CREATE FUNCTION getname(emp) RETURNS text AS $$ SELECT $1.name; $$ LANGUAGE SQL; SELECT getname(new_emp()); getname --------- None (1 row)
还有另一种使用返回组合类型的函数的方法是把它作为一个表函数调用,可参考Section 35.4.7。
一种描述一个函数的结果的替代方法是定义它的输出参数,例如:
CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int) AS 'SELECT x + y' LANGUAGE SQL; SELECT add_em(3,7); add_em -------- 10 (1 row)
这和Section 35.4.2中展示的add_em版本没有本质上的不同。输出参数的真正价值是它们提供了一种方便的方法来定义返回多个列的函数。例如,
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int) AS 'SELECT x + y, x * y' LANGUAGE SQL; SELECT * FROM sum_n_product(11,42); sum | product -----+--------- 53 | 462 (1 row)
这里实际发生的是我们为该函数的结果创建了一个匿名的组合类型。上述例子具有与下面相同的最终结果
CREATE TYPE sum_prod AS (sum int, product int); CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod AS 'SELECT $1 + $2, $1 * $2' LANGUAGE SQL;
但是不必单独定义组合类型常常很方便。注意输出参数的名称并非只是装饰,而且决定了匿名组合类型的列名(如果你为一个输出参数忽略了名称,系统将自行选择一个名称)。
在从 SQL 调用这样一个函数时,输出参数不会被包括在调用参数列表中。这是因为PostgreSQL只考虑输入参数来定义函数的调用签名。这也意味着在为诸如删除函数等目的引用该函数时只有输入参数有关系。我们可以用下面的命令之一删除上述函数
DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int); DROP FUNCTION sum_n_product (int, int);
参数可以被标记为IN(默认)、OUT、INOUT或者VARIADIC。一个INOUT参数既作为一个输入参数(调用参数列表的一部分)又作为一个输出参数(结果记录类型的一部分)。VARIADIC参数是输入参数,但被按照后文所述特殊对待。
只要"可选的"参数都是相同的数据类型,SQL函数可以被声明为接受可变数量的参数。可选的参数将被作为一个数组传递给该函数。声明该函数时要把最后一个参数标记为VARIADIC,这个参数必须被声明为一个数组类型,例如:
CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$ SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i); $$ LANGUAGE SQL; SELECT mleast(10, -1, 5, 4.4); mleast -------- -1 (1 row)
实际上,所有位于或者超过VARIADIC位置的实参会被收集成一个一位数组,就好像你写了:
SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- 不起作用
但是你实际无法这样写 — 或者说至少它将无法匹配这个函数定义。一个被标记为VARIADIC的参数匹配其元素类型的一次或者多次出现,而不是它自身类型的出现。
Sometimes it is useful to be able to pass an already-constructed array to a variadic function; this is particularly handy when one variadic function wants to pass on its array parameter to another one. You can do that by specifying VARIADIC in the call:
SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
This prevents expansion of the function's variadic parameter into its element type, thereby allowing the array argument value to match normally. VARIADIC can only be attached to the last actual argument of a function call.
The array element parameters generated from a variadic parameter are treated as not having any names of their own. This means it is not possible to call a variadic function using named arguments (Section 4.3), except when you specify VARIADIC. For example, this will work:
SELECT mleast(VARIADIC arr := ARRAY[10, -1, 5, 4.4]);
but not these:
SELECT mleast(arr := 10); SELECT mleast(arr := ARRAY[10, -1, 5, 4.4]);
Functions can be declared with default values for some or all input arguments. The default values are inserted whenever the function is called with insufficiently many actual arguments. Since arguments can only be omitted from the end of the actual argument list, all parameters after a parameter with a default value have to have default values as well. (Although the use of named argument notation could allow this restriction to be relaxed, it's still enforced so that positional argument notation works sensibly.)
例如:
CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3) RETURNS int LANGUAGE SQL AS $$ SELECT $1 + $2 + $3; $$; SELECT foo(10, 20, 30); foo ----- 60 (1 row) SELECT foo(10, 20); foo ----- 33 (1 row) SELECT foo(10); foo ----- 15 (1 row) SELECT foo(); -- fails since there is no default for the first argument ERROR: function foo() does not exist
The = sign can also be used in place of the key word DEFAULT.
All SQL functions can be used in the FROM clause of a query, but it is particularly useful for functions returning composite types. If the function is defined to return a base type, the table function produces a one-column table. If the function is defined to return a composite type, the table function produces a column for each attribute of the composite type.
Here is 一个例子:
CREATE TABLE foo (fooid int, foosubid int, fooname text); INSERT INTO foo VALUES (1, 1, 'Joe'); INSERT INTO foo VALUES (1, 2, 'Ed'); INSERT INTO foo VALUES (2, 1, 'Mary'); CREATE FUNCTION getfoo(int) RETURNS foo AS $$ SELECT * FROM foo WHERE fooid = $1; $$ LANGUAGE SQL; SELECT *, upper(fooname) FROM getfoo(1) AS t1; fooid | foosubid | fooname | upper -------+----------+---------+------- 1 | 1 | Joe | JOE (1 row)
As the example shows, we can work with the columns of the function's result just the same as if they were columns of a regular table.
Note that we only got one row out of the function. This is because we did not use SETOF. That is described in the next section.
When an SQL function is declared as returning SETOF sometype, the function's final query is executed to completion, and each row it outputs is returned as an element of the result set.
This feature is normally used when calling the function in the FROM clause. In this case each row returned by the function becomes a row of the table seen by the query. For example, assume that table foo has the same contents as above, and we say:
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$ SELECT * FROM foo WHERE fooid = $1; $$ LANGUAGE SQL; SELECT * FROM getfoo(1) AS t1;
Then we would get:
fooid | foosubid | fooname -------+----------+--------- 1 | 1 | Joe 1 | 2 | Ed (2 rows)
It is also possible to return multiple rows with the columns defined by output parameters, like this:
CREATE TABLE tab (y int, z int); INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8); CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int) RETURNS SETOF record AS $$ SELECT $1 + tab.y, $1 * tab.y FROM tab; $$ LANGUAGE SQL; SELECT * FROM sum_n_product_with_tab(10); sum | product -----+--------- 11 | 10 13 | 30 15 | 50 17 | 70 (4 rows)
The key point here is that you must write RETURNS SETOF record to indicate that the function returns multiple rows instead of just one. If there is only one output parameter, write that parameter's type instead of record.
It is frequently useful to construct a query's result by invoking a set-returning function multiple times, with the parameters for each invocation coming from successive rows of a table or subquery. The preferred way to do this is to use the LATERAL key word, which is described in Section 7.2.1.5. Here is an example using a set-returning function to enumerate elements of a tree structure:
SELECT * FROM nodes; name | parent -----------+-------- Top | Child1 | Top Child2 | Top Child3 | Top SubChild1 | Child1 SubChild2 | Child1 (6 rows) CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$ SELECT name FROM nodes WHERE parent = $1 $$ LANGUAGE SQL STABLE; SELECT * FROM listchildren('Top'); listchildren -------------- Child1 Child2 Child3 (3 rows) SELECT name, child FROM nodes, LATERAL listchildren(name) AS child; name | child --------+----------- Top | Child1 Top | Child2 Top | Child3 Child1 | SubChild1 Child1 | SubChild2 (5 rows)
This example does not do anything that we couldn't have done with a simple join, but in more complex calculations the option to put some of the work into a function can be quite convenient.
Currently, functions returning sets can also be called in the select list of a query. For each row that the query generates by itself, the function returning set is invoked, and an output row is generated for each element of the function's result set. Note, however, that this capability is deprecated and might be removed in future releases. The previous example could also be done with queries like these:
SELECT listchildren('Top'); listchildren -------------- Child1 Child2 Child3 (3 rows) SELECT name, listchildren(name) FROM nodes; name | listchildren --------+-------------- Top | Child1 Top | Child2 Top | Child3 Child1 | SubChild1 Child1 | SubChild2 (5 rows)
In the last SELECT,
notice that no output row appears for Child2, Child3, etc.
This happens because listchildren
returns an empty set
for those arguments, so no result rows are generated. This is the same
behavior as we got from an inner join to the function result when using
the LATERAL syntax.
Note: If a function's last command is INSERT, UPDATE, or DELETE with RETURNING, that command will always be executed to completion, even if the function is not declared with SETOF or the calling query does not fetch all the result rows. Any extra rows produced by the RETURNING clause are silently dropped, but the commanded table modifications still happen (and are all completed before returning from the function).
Note: The key problem with using set-returning functions in the select list, rather than the FROM clause, is that putting more than one set-returning function in the same select list does not behave very sensibly. (What you actually get if you do so is a number of output rows equal to the least common multiple of the numbers of rows produced by each set-returning function.) The LATERAL syntax produces less surprising results when calling multiple set-returning functions, and should usually be used instead.
There is another way to declare a function as returning a set, which is to use the syntax RETURNS TABLE(columns). This is equivalent to using one or more OUT parameters plus marking the function as returning SETOF record (or SETOF a single output parameter's type, as appropriate). This notation is specified in recent versions of the SQL standard, and thus may be more portable than using SETOF.
For example, the preceding sum-and-product example could also be done this way:
CREATE FUNCTION sum_n_product_with_tab (x int) RETURNS TABLE(sum int, product int) AS $$ SELECT $1 + tab.y, $1 * tab.y FROM tab; $$ LANGUAGE SQL;
It is not allowed to use explicit OUT or INOUT parameters with the RETURNS TABLE notation — you must put all the output columns in the TABLE list.
SQL functions can be declared to accept and
return the polymorphic types anyelement,
anyarray, anynonarray,
anyenum, and anyrange. See Section 35.2.5 for a more detailed
explanation of polymorphic functions. Here is a polymorphic
function make_array
that builds up an array
from two arbitrary data type elements:
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$ SELECT ARRAY[$1, $2]; $$ LANGUAGE SQL; SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray; intarray | textarray ----------+----------- {1,2} | {a,b} (1 row)
Notice the use of the typecast 'a'::text to specify that the argument is of type text. This is required if the argument is just a string literal, since otherwise it would be treated as type unknown, and array of unknown is not a valid type. Without the typecast, you will get errors like this:
ERROR: could not determine polymorphic type because input has type "unknown"
It is permitted to have polymorphic arguments with a fixed return type, but the converse is not. 例如:
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$ SELECT $1 > $2; $$ LANGUAGE SQL; SELECT is_greater(1, 2); is_greater ------------ f (1 row) CREATE FUNCTION invalid_func() RETURNS anyelement AS $$ SELECT 1; $$ LANGUAGE SQL; ERROR: cannot determine result data type DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
Polymorphism can be used with functions that have output arguments. 例如:
CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray) AS 'select $1, array[$1,$1]' LANGUAGE SQL; SELECT * FROM dup(22); f2 | f3 ----+--------- 22 | {22,22} (1 row)
Polymorphism can also be used with variadic functions. 例如:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$ SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i); $$ LANGUAGE SQL; SELECT anyleast(10, -1, 5, 4); anyleast ---------- -1 (1 row) SELECT anyleast('abc'::text, 'def'); anyleast ---------- abc (1 row) CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$ SELECT array_to_string($2, $1); $$ LANGUAGE SQL; SELECT concat_values('|', 1, 4, 2); concat_values --------------- 1|4|2 (1 row)
When a SQL function has one or more parameters of collatable data types,
a collation is identified for each function call depending on the
collations assigned to the actual arguments, as described in Section 22.2. If a collation is successfully identified
(i.e., there are no conflicts of implicit collations among the arguments)
then all the collatable parameters are treated as having that collation
implicitly. This will affect the behavior of collation-sensitive
operations within the function. For example, using the
anyleast
function described above, the result of
SELECT anyleast('abc'::text, 'ABC');
will depend on the database's default collation. In C locale the result will be ABC, but in many other locales it will be abc. The collation to use can be forced by adding a COLLATE clause to any of the arguments, for example
SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
Alternatively, if you wish a function to operate with a particular
collation regardless of what it is called with, insert
COLLATE clauses as needed in the function definition.
This version of anyleast
would always use en_US
locale to compare strings:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$ SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i); $$ LANGUAGE SQL;
But note that this will throw an error if applied to a non-collatable data type.
If no common collation can be identified among the actual arguments, then a SQL function treats its parameters as having their data types' default collation (which is usually the database's default collation, but could be different for parameters of domain types).
The behavior of collatable parameters can be thought of as a limited form of polymorphism, applicable only to textual data types.