分類彙整: IBM Programming Contest

SQL and XQuery Tutorial for IBM DB2 - XQuery

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao7/section2.html

When to use XML

Situations in which an XML representation is beneficial include:

  • When schema is volatile.
    If the schema of the data changes often, then representing it in relational form may be onerous given the cost and difficulty of changing the associated relational schema. The self-describing nature of XML makes schema modification much simpler.
  • When data is inherently hierarchical.
    Some data is inherently tabular, and a relational model makes the most sense for it. Other data is naturally hierarchical, and XML is often the best representation for it.
  • When data represents business objects in which the component parts do not make sense when removed from the context of that business object.
    For example, consider a standard employee and phone number relationship in which one employee can have multiple phone numbers: one for the office, one for fax, one for home, and one for mobile. If the most frequent usage pattern is to retrieve phone numbers in the context of the employee, it does not make sense to normalize the data and introduce a table solely to track the multiplicity of phone numbers associated with an employee. A better choice may be to keep those phone numbers in the employee relation and represent them using XML.
  • When applications have sparse attributes.
    Some applications have a large number of possible attributes, most of which are sparse or absent for any given data value. A classic example is a merchant catalog; the number of different attributes to track for a given catalog item is huge, including size, color, weave, power requirements and a nearly endless list of other considerations. For any given object, only a subset of these attributes is relevant; the weave of a sweater makes sense but the weave of a lawn mower is nonsensical. Using a relational table to describe the characteristics of the object can be costly and overly complex. Representing such descriptive attribute information as XML data allows for more natural representation as well as less complex and expensive searching.
  • When low-volume data is highly structured.
    In many applications, structured information is critical to the application but it exists in very small quantities. While that information can be represented in a normal relation, this approach can lead to massive relational schemas. Using an XML column with multiple views can dramatically reduce the number of managed objects in a database, and thus reduce the cost of ownership.

Using XQuery
To execute an XQuery directly in DB2 9, you must preface the query with the keyword

xquery

.
Case-sensitive

FLWOR Expression

1.

for
Iterates through an input sequence, binding a variable to each input item in turn
2.

let
Declares a variable and assigns it a value, which may be a list containing multiple items
3.

where
Specifies criteria for filtering query results
4.

order by
Specifies the sort order of the result
5.

return
Defines the result to be returned

Path Expression

Separated by slash(/) characters
Each step in a path expression may also contain a predicate that filters the elements that are returned by that step, retaining only elements that satisfy some condition.

$comments/comment/feedback[type = "suggestion"]/content

Same Result Query

FLWOR Expression

for $y in db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')
return $y

Path Expression

db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')
db2-fn:xmlcolumn

with a parameter that identifies the table name and column name of an XML column in a DB2 table. It returns the sequence of XML documents that is stored in the given column.

The two queries, although producing the same result, are executed slightly differently.

For the FLWOR query, the second line instructs DB2 to iterate through the content sub-elements of the <Comments> elements contained in the aroma.sales.comments column. Each content element is bound in turn to the variable $y. The third line indicates that, for each iteration, the value of $y is returned. The result is a sequence of XML elements.

For the path query, the first step calls the

db2-fn:xmlcolumn

function to obtain a list of XML documents from the Comments column of the aroma.sales table. The second step returns all the <Comments> elements in these documents, the third step returns the <comment> elements nested inside these <Comments> elements, the fourth step returns the <feedback> elements nested inside the <comment> elements, and the fifth step returns the <content> elements nested inside the <feedback> elements.

Text()

db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/feedback/content/text()

Filtering - where

xquery
for $y in db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/feedback
where $y/type = "suggestion" or $y/type = "question"
return $y/content/text()
db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/feedback[type = "suggestion" or type = "question"]/content/text()

HTML Output using FLWOR

xquery
<ul> {
for $y in db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/store_response
where $y/completed = "no"
return <li>{$y/action/text()}</li>
} </ul>

XML Output using FLWOR

xquery
for $y in db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment
where $y/store_response/completed = "no"
return
	(
    if ($y/feedback/type = 'suggestion')
        then
            <suggestion>
            {$y/comment_ID,
             $y/customer_info,
             $y/store_response/action}
            </suggestion>
        else
            <question>
            {$y/comment_ID,
             $y/customer_info,
             $y/store_response/action}
            </question>
)

Order By

xquery
for $y in db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment
where $y/feedback/type = "opinion"
order by $y/store_rating/score
return
    <comment>
        <rating>{$y/store_rating/score/text()}</rating>
        <opinion>{$y/feedback/content/text()}</opinion>
    </comment>

{ } is similar to <?php ?>

distinct-values

xquery
for $t in distinct-values
(db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/feedback/type)
let $tc := db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment[feedback/type = $t]
return
    <comments>
        <type>{$t}</type>
        <count>{count($tc)}</count>
    </comments>

:= is the same as it is in Pascal

Average - avg

xquery
avg(db2-fn:xmlcolumn('AROMA.SALES.COMMENTS')/Comments/comment/store_rating/score)

XQueries with embedded SQL - db2-fn:sqlquery()

for $y in
db2-fn:sqlquery(
    'SELECT Comments
     FROM aroma.sales a, aroma.store b
     WHERE city = ''New York''
         AND a.storekey = b.storekey')/Comments/comment/feedback/content/text()
return $y
db2-fn:sqlquery(

SQL and XQuery tutorial for IBM DB2 - Subqueries

Adopt from
http://www.ibm.com/developerworks/edu/dm-dw-dm-0608cao5-i.html

CASE Syntax - Change the column data properties

SELECT prod_name,
    SUM(CASE WHEN store_name = 'Beaches Brew'
    then dollars else 0 end) AS Beaches,
    SUM(CASE WHEN store_name = 'Cupertino Coffee Supply'
    then dollars else 0 end) AS Cupertino,
    SUM(CASE WHEN store_name = 'Roasters, Los Gatos'
    then dollars else 0 end) AS RoastLG,
    SUM(CASE WHEN store_name = 'San Jose Roasting Company'
    then dollars else 0 end) AS SJRoastCo,
    SUM(CASE WHEN store_name = 'Java Judy''s'
    then dollars else 0 end) AS JavaJudy,
    SUM(CASE WHEN store_name = 'Instant Coffee'
    then dollars else 0 end) AS Instant
FROM aroma.market a,
     aroma.store b,
     aroma.period c,
     aroma.product d,
     aroma.class e,
     aroma.sales f
WHERE   a.mktkey = b.mktkey
    AND b.storekey = f.storekey
    AND c.perkey = f.perkey
    AND d.classkey = e.classkey
    AND d.classkey = f.classkey
    AND d.prodkey = f.prodkey
    AND region LIKE 'West%'
    AND year = 2004
    AND class_type = 'Pkg_coffee'
GROUP BY prod_name
ORDER BY prod_name;

Subqueries

SELECT prod_name, store_name, date, dollars AS sales_05,
	    (SELECT DEC(AVG(dollars),7,2)
	    FROM aroma.product a,
	         aroma.sales b,
	         aroma.period c,
	         aroma.store d
	    WHERE   a.prodkey = b.prodkey
	        AND a.classkey = b.classkey
	        AND c.perkey = b.perkey
	        AND d.storekey = b.storekey
	        AND year = 2004
	        AND month = 'DEC'
	        AND store_name = 'San Jose Roasting Company'
	        AND prod_name LIKE 'Lotta%') AS avg_04
	FROM aroma.product a,
	     aroma.sales b,
	     aroma.period c,
	     aroma.store d
	WHERE   a.prodkey = b.prodkey
	    AND a.classkey = b.classkey
	    AND c.perkey = b.perkey
	    AND d.storekey = b.storekey
	    AND prod_name LIKE 'Lotta%'
	    AND store_name = 'San Jose Roasting Company'
	    AND year = 2005
	    AND month = 'DEC'
	    AND dollars <
	        (SELECT AVG(dollars)
	        FROM aroma.product a,
	             aroma.sales b,
	             aroma.period c,
	             aroma.store d
	        WHERE   a.prodkey = b.prodkey
	            AND a.classkey = b.classkey
	            AND c.perkey = b.perkey
	            AND d.storekey = b.storekey
	            AND year = 2004
	            AND month = 'DEC'
	            AND store_name = 'San Jose  Roasting Company'
	            AND prod_name LIKE 'Lotta%');

Using Subquery in Group By

SELECT sales1.product, jan_04_sales, total_04_sales,
	    DEC((100.00 * jan_04_sales/total_04_sales),7,2) AS pct_of_04
	FROM
	    (SELECT a1.prod_name, SUM(dollars)
	    FROM aroma.product a1,
	         aroma.sales b1,
	         aroma.period c1,
	         aroma.store d1
	    WHERE   a1.prodkey = b1.prodkey
	        AND a1.classkey = b1.classkey
	        AND c1.perkey = b1.perkey
	        AND d1.storekey = b1.storekey
	        AND c1.year = 2004
	        AND c1.month = 'JAN'
	        AND d1.city LIKE 'San J%'
	    GROUP BY a1.prod_name) AS sales1(product, jan_04_sales)
	,
	    (SELECT a2.prod_name, SUM(dollars)
	    FROM aroma.product a2,
	         aroma.sales b2,
	         aroma.period c2,
	         aroma.store d2
	    WHERE   a2.prodkey = b2.prodkey
	        AND a2.classkey = b2.classkey
	        AND c2.perkey = b2.perkey
	        AND d2.storekey = b2.storekey
	        AND c2.year = 2004
	        AND d2.city LIKE 'San J%'
	    GROUP BY a2.prod_name) AS sales2(product, total_04_sales)
	WHERE sales1.product = sales2.product
	ORDER BY pct_of_04 DESC
	FETCH FIRST 10 ROWS ONLY;

SQL0412N Multiple columns are returned from a subquery that is allowed only one column.

Comparison predicates in subqueries

SELECT store_name
		FROM aroma.store
		WHERE city = ANY
		    (SELECT hq_city
		    FROM aroma.market)
		ORDER BY state, city, store_name;

PREDICATE
-EVALUATES TO "TRUE" WHEN
-WHEN NO VALUE IS RETURNED

ALL
-The comparison is true for all values returned by the subquery.
-Evaluates to true.
ANY, SOME
-The comparison is true for at least one of the values returned by the subquery.
-Evaluates to false.
EXISTS
-The subquery produces at least one row.
-Evaluates to false.

SQL and XQuery tutorial for IBM DB2 - Data analysis

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao4/index.html
http://it.toolbox.com/blogs/db2luw/olap-sql-part-5-windowing-aggregates-8373

On-Line Analytical Processing (OLAP) functions provide the ability to return ranking, row numbering and existing column function information as a scalar value in a query result.

SELECT date, SUM(dollars) AS total_dollars,
    SUM(SUM(dollars)) OVER(ORDER BY date ROWS UNBOUNDED PRECEDING) AS run_dollars,
    SUM(quantity) AS total_qty,
    SUM(SUM(quantity)) OVER(ORDER BY date ROWS UNBOUNDED PRECEDING) AS run_qty
	FROM aroma.period a, aroma.sales b, aroma.product c
	WHERE   a.perkey = b.perkey
	    AND c.prodkey = b.prodkey
	    AND c.classkey = b.classkey
	    AND year = 2006
	    AND month = 'JAN'
	    AND prod_name = 'Aroma Roma'
	GROUP BY date
	ORDER BY date;

OLAP ROW_NUMBER function
You can even use OLAP functions for simple tasks such as providing line numbers in a result set, as in:

SELECT ROW_NUMBER() OVER() AS row_num, order_no, price
FROM aroma.orders;

Using PARTITION BY - Calculate on Each Partition Separately

SELECT date, SUM(dollars) AS total_dollars,
	    SUM(SUM(dollars)) OVER(PARTITION BY week ORDER BY date
	    ROWS UNBOUNDED PRECEDING) AS run_dollars,
	    SUM(quantity) AS total_qty,
	    SUM(SUM(quantity)) OVER(PARTITION BY week ORDER BY date
	    ROWS UNBOUNDED PRECEDING) AS run_qty
	FROM aroma.period a, aroma.sales b, aroma.product c
	WHERE   a.perkey = b.perkey
	    AND c.prodkey = b.prodkey
	    AND c.classkey = b.classkey
	    AND year = 2006
	    AND month = 'JAN'
	    AND prod_name = 'Aroma Roma'
	GROUP BY week, date
	ORDER BY week, date;

The DEC function is used to remove all but two of the decimal places from each Price value

Use Rank()

SELECT store_name, district, SUM(dollars) AS total_sales,
	    RANK() OVER(ORDER BY SUM(dollars) DESC) AS sales_rank
	FROM aroma.market a,
	     aroma.store b,
	     aroma.sales c,
	     aroma.period d
	WHERE   a.mktkey = b.mktkey
	    AND b.storekey = c.storekey
	    AND d.perkey = c.perkey
	    AND year = 2005
	    AND month = 'MAR'
	    AND region = 'West'
	GROUP BY store_name, district;

Using Date Arithmetic

SELECT date - 3 MONTHS - 4 DAYS AS due_date,
	    date AS cur_date,
	    date + 3 MONTHS + 4 DAYS AS past_due
	FROM aroma.period
	WHERE year = 2004
	    AND month = 'JAN';

You can similarly adjust aspects of any TIME datatype (HOURS, MINUTES, and/or SECONDS) or any TIMESTAMP datatype (YEARS, MONTHS, DAYS, HOURS, MINUTES, and/or SECONDS). For more information about DATE, TIME, and TIMESTAMP arithmetic.

Conditions on groups: HAVING clause

SELECT column name(s)
	FROM table name(s)
	[WHERE search_condition]
	[GROUP BY group_list]
	[HAVING condition]
	[ORDER BY order_list];

WHERE Clause
Works on rows of data prior to grouping.
Conditions cannot be expressed with set functions (for example, SUM or AVG), but column aliases for nonaggregate expressions can be used.

HAVING Clause
Works on the result set after grouping. 

 

 

 

 

Conditions can be expressed with any set function or column alias.

Any set function can be referenced in a condition in the HAVING clause. A query with a HAVING clause must contain a GROUP BY clause, unless the select list contains only set functions. For example:

SELECT MIN(prodkey), MAX(classkey)
	FROM aroma.product
	HAVING MIN(prodkey) = 0;

SQL and XQuery tutorial for IBM DB2 - UNION, INTERSECT and EXCEPT

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao3/section12.html
https://www6.software.ibm.com/dw/education/dm/dm0608cao3/section13.html
https://www6.software.ibm.com/dw/education/dm/dm0608cao3/section15.html
http://msdn.microsoft.com/zh-tw/library/ms191255.aspx

UNION

query_expression UNION | INTERSECT | EXCEPT [ALL] query_expression
[ORDER BY order_list];
使用 EXCEPT,下列查詢會傳回 EXCEPT 運算元左邊查詢結果有、但右邊查詢沒有的任何相異值。

query_expression - Any join or nonjoin query expression, as defined in the SQL Reference Guide .

If the ORDER BY clause is used, the values must reference columns from the select list of the first query expression. Using column names from the second query expression results in an error.

ORDER BY 1, 2

Example UNION query

		SELECT store_name AS store, store_type AS size,
		       state, SUM(dollars) AS sales
		FROM     aroma.period p
		    JOIN aroma.sales s ON p.perkey = s.perkey
		    JOIN aroma.store r ON r.storekey = s.storekey
		WHERE   store_type = 'Medium'
		    AND year = 2005
		    AND week = 52
		GROUP BY store_name, store_type, state

		UNION

		SELECT store_name AS store, store_type AS size,
		       state, SUM(dollars)
		FROM     aroma.period p
		    JOIN aroma.sales s ON p.perkey = s.perkey
		    JOIN aroma.store r ON r.storekey = s.storekey
		WHERE   store_type = 'Large'
		    AND year = 2005
		    AND week = 52
		GROUP BY store_name, store_type, state
		ORDER BY 2, 1;

Example query (UNION ALL)

		SELECT city
		FROM aroma.store
		UNION ALL
		SELECT hq_city AS city
		FROM aroma.market
		ORDER BY city;

Using UNION ALL if you want to repeat values that occur more than once in the list of unioned objects. For example, you may want to count the total number of objects in each case.

SQL and XQuery tutorial for IBM DB2 - Group By Clause

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao3/section10.html

Set functions operate on all rows of a result table or on groups of rows defined by a GROUP BY clause. For example, you can group the sales for each market and calculate the respective sum, maximum, and minimum values.

Syntax of the GROUP BY clause

		SELECT column name(s)
		FROM table name(s)
		[WHERE search_condition]
		[GROUP BY group_list]
		[ORDER BY order_list];

Using Multiple groups

SELECT year, region, district, city, SUM(dollars) AS sales
FROM aroma.store a, aroma.sales b, aroma.product c,
     aroma.period d, aroma.market e
WHERE   a.storekey = b.storekey
    AND c.prodkey = b.prodkey
    AND c.classkey = b.classkey
    AND d.perkey = b.perkey
    AND e.mktkey = a.mktkey
    AND year IN (2004, 2005)
GROUP BY year, region, district, city
ORDER BY year, region, district, city;

SQL and XQuery tutorial for IBM DB2 - SQL Joins and Aggregates

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao3/dm0608cao3-pdf.pdf
http://en.wikipedia.org/wiki/Join_(SQL)
http://www.1keydata.com/tw/sql/sqljoins.html

Table Aliases
SELECT date, dollars
FROM aroma.period a, aroma.sales b
WHERE a.perkey=b.perkey
AND a.month='JAN'
AND a.year=2006;

Simple Join
SELECT prod_name, store_name, day, dollars
FROM aroma.promotion a, aroma.product b,
     aroma.period c, aroma.store d, aroma.sales e
WHERE       a.promokey = e.promokey
    AND    b.prodkey = e.prodkey
    AND    b.classkey = e.classkey
    AND    c.perkey = e.perkey
    AND    d.storekey = e.storekey

    AND     prod_name LIKE 'Easter%'
    AND     day IN ('SA', 'SU')
    AND     promo_type = 900
    AND     year = 2005;

Self-joins
The tables being joined in a query do not need to be distinct; you can join any table to itself as long as you give each table reference a different name. Self-joins are useful for discovering relationships between different columns of data in the same table. 

不同的 SQL JOIN
除了我们在上面的例子中使用的 INNER JOIN(内连接),我们还可以其他几种连接。
下面列出了您可以使用的 JOIN 类型,以及它们之间的差异。

  • JOIN: 如果表中有至少一个匹配,则返回行
  • LEFT JOIN: 即使右表中没有匹配,也从左表返回所有的行
  • RIGHT JOIN: 即使左表中没有匹配,也从右表返回所有的行
  • FULL JOIN: 只要其中一个表中存在匹配,就返回行

Innter Join and Outer Join
In most cases, tables are joined according to search conditions that find only the rows with matching values; this type of join is known as an inner join. In some cases, however, decision-support analysis requires outer joins, which retrieve both matching and non-matching rows, which express, for example, a greater-than or less-than relationship.

An outer join operation returns all the rows returned by an inner join plus all the rows from one table that do not match any row from the other table. An outer join can be left, right, or full, depending on whether rows from the left, right, or both tables are retained. The first table listed in the FROM clause is referred to as the left table and the second as the right table. For all three types of outer join, NULLs are used to represent empty columns in rows that do not match.

FROM table_1 LEFT|RIGHT|FULL OUTER JOIN table_2
    ON table_1.column = table_2.column

Set Functions
 SUM(expression) - Calculates the sum of all the values in expression.
 SUM(DISTINCT expression) - Calculates the sum of distinct values in expression.
 AVG(expression) - Calculates the average of all the values in expression.
 AVG(DISTINCT expression) - Calculates the average of distinct values in expression.
 MAX(expression) - Determines the maximum value in expression.
 MIN(expression) - Determines the minimum value in expression.
 COUNT(*) - Counts the number of rows returned.
 COUNT(expression) - Counts the number of non-null values in expression.
 COUNT(DISTINCTexpression) - Counts the number of distinct non-null values in expression.

You can replace expression with any column name or numeric expression. Each function, except COUNT(*), ignores NULL values when calculating the returned aggregate value.

SELECT SUM(dollars) as Dol_Sales, AVG(dollars) as Avg_Sales,
MAX(dollars) as Max_Sales, MIN(dollars) as Min_Sales, COUNT(*) as Qty
FROM aroma.store a, aroma.sales b, aroma.product c, aroma.period d
WHERE   a.storekey = b.storekey
    AND c.prodkey = b.prodkey
    AND c.classkey = b.classkey
    AND d.perkey = b.perkey
    AND prod_name LIKE 'Lotta Latte%'
    AND year = 2005
    AND city LIKE 'Los Ang%';

as change the field name

What is Database Schemas

Adopt from
http://blog.sqlauthority.com/2009/09/07/sql-server-importance-of-database-schemas-in-sql-server/

A schema is now an independent entity- a container of objects distinct from the user who created those objects.

Implications
The separation of ownership from schemas has important implications:

  • Ownership of schemas and schema-owned objects is transferable. This is accomplished using the ALTER AUTHORIZATION command.
  • Objects can be moved between schemas. This is accomplished using the ALTER SCHEMA command.
  • A single schema can contain objects owned by multiple database users.
  • Multiple database users can share a single default schema.
  • Permissions on schemas and schema-contained objects can be managed with greater precision than in earlier releases. This is accomplished using schema GRANT permissions object GRANT permissions.
  • A schema can be owned by any database principal. This includes roles and application roles.
  • A database user can be dropped without dropping objects in a corresponding schema.
  • Code written for earlier releases of SQL Server may return incorrect results, if the code assumes that schemas are equivalent to database users.
  • Catalog views designed for earlier releases of SQL Server may return incorrect results. This includes sysobjects.
  • Object access and manipulation are now more complex as well as more secure since they involve an additional layer of security.

SQL and XQuery tutorial for IBM DB2 - Basic queries

Adopt from
https://www6.software.ibm.com/dw/education/dm/dm0608cao2/index.html

CONNECT TO aromadb USER userid USING password

SELECT perkey, sum(dollars)
FROM aroma.sales
WHERE perkey < 50
GROUP BY perkey
HAVING sum(dollars) > 8000
ORDER BY perkey;

SELECT mktkey, hq_city, hq_state, district, region FROM aroma.market

SELECT DISTINCT district, region
FROM aroma.market;

WHERE clause is case-sensitive.

<> not equal
= equal

A simple condition can be expressed with the following SQL comparison predicates:

Predicate
BETWEEN expression1 AND expression2
LIKE pattern
IN (list)
IS NULL
IS NOT NULL
ALL
SOME or ANY
EXISTS

SELECT hq_city, hq_state, district
FROM aroma.market
WHERE district IN ('Chicago', 'New York', 'New Orleans');

Wildcard characters
With the LIKE predicate and the two wildcard characters, the percent sign (%) and the underscore (_), you can also express conditions that match a portion of a character string (a substring).
SELECT district, hq_city
FROM aroma.market
WHERE district LIKE 'Min%';

AS Clause
SELECT district, hq_city AS City
FROM aroma.market
WHERE district LIKE 'Min%';
Result

District City
Minneapolis Minneapolis
Minneapolis Milwaukee