Convert rows into comma separated values in a column 9


In this post, we are going to learn a technique to combine all the values of a row (based on a particular condition) with a separator, in a column along with other columns. What i meant by this let me explain it in more detail. Have a look on below question as raised by a colleague;

I have a Students table and a Subjects table. One student has been assigned to multiple Subjects in a transaction table. I want to extract the name of all the Subjects for which the student is assigned in a comma separated way along with other details of the student. If student is not assigned yet, i want to show a “Not Assigned Yet” value for those records.

To achieve this task, i have created a demo script and sharing the same with all the details. Below you can find the table structures and demo data.

We are creating three tables as Students, Subjects and StudentSubjectMappings for our demo.

--Creating Students Table
CREATE TABLE dbo.Students
(
StudentID INT,
StudentName NVARCHAR(100),
Class VARCHAR(10),
)
GO

--Creating Subjects Table
CREATE TABLE dbo.Subjects
(
SubjectID INT,
SubjectName NVARCHAR(100),
)
GO

--Creating Student and subject mapping in a transaction table named StudentSubjectMappings
CREATE TABLE dbo.StudentSubjectMappings
(
StudentID INT,
SubjectID INT
)
GO

Insert some demo values in all these created tables as below;

INSERT INTO dbo.Students(StudentID, StudentName, Class)
VALUES(1, 'Student 1', 'X'),
(2, 'Student 2', 'XI'),
(3, 'Student 3', 'XII'),
(4, 'Student 4', 'X'),
(5, 'Student 5', 'XI'),
(6, 'Student 6', 'XII')
GO

INSERT INTO dbo.Subjects(SubjectID, SubjectName)
VALUES(1, 'Physics'),
(2, 'Chemistry'),
(3, 'Mathematics'),
(4, 'English'),
(5, 'Social Science'),
(6, 'Hindi')
GO

INSERT INTO dbo.StudentSubjectMappings(StudentID, SubjectID)
VALUES(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4),
(3, 5), (3, 6)
GO

Now have a look on the tables data.

SELECT * FROM dbo.Students

rows to comma separated values

SELECT * FROM dbo.Subjects

rows to comma separated values 2

SELECT * FROM dbo.StudentSubjectMappings

rows to comma separated values 3

Required Output:

rows to comma separated values 4

 

Now to get the final output as required, use this query;

SELECT StudentID, StudentName, Class,
ISNULL(STUFF((SELECT ', ' + SubjectName
FROM dbo.StudentSubjectMappings SSM
INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID
WHERE SSM.StudentID = STD.StudentID
FOR XML PATH('')
), 1, 1, ''), 'Not Assigned Yet') AS SubjectList
FROM dbo.Students STD

Let’s discuss this trick:

To achieve this task, i have used “FOR XML” clause with “PATH” mode inside a sub-query. FOR XML clause is used to retrieve the result of a SQL query as XML data and “PATH” mode for the “FOR XML” clause provides a way to mix elements and attributes.

In above query, i have used this trick. I have used a blank element name for PATH mode inside PATH(”). It will cause to create a XML data with blank element name. Also nothing is specified as a column name for expression ‘, ‘ + SubjectName which is an expression for a column in our subquery. This blank column name will cause a blank child element name for the created XML. So in our XML data, we don’t have any element name and it’s just a string created from all the rows.

I have prefixed all the row values with a comma and then removed it from start which is easy to find. To remove it, i have used a STUFF function to do so. Adding the same comma at end and then removing it from end will need to find the length of the string to remove it from end. Now for each row in upper query, the sub query will execute and find all the rows in “dbo.Subject” table and then it will combine all the values in one row as described above.

##Below is an addition from comment section to highlight this important point in main post##

HTML character handling in FOR XML PATH

Pointed by : [Jeff Moden] & Answered by :  [John Imel]

Thanks to @Jeff Moden for highlighting this point and @John Imel for answering this nicely in comment section of this post. It’s really appreciable and helpful. Hats off to both of you!

From comment section i am here adding a provision to handle html characters in above code. I am leaving the above code untouched, hence in case you have to use the output of above query in an html page where html encoded values will be decoded back, it will work fine. But in case you have to display it as it is somewhere, it will encode the html characters like;
& sign to @amp; and < sign to &lt; and >  sign to &gt; etc.

To avoid such kind of encoding use this code;

SELECT StudentID, StudentName, Class,
ISNULL(STUFF((SELECT ', ' + COALESCE(SubjectName, '')
FROM dbo.StudentSubjectMappings SSM
INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID
WHERE SSM.StudentID = STD.StudentID
FOR XML PATH(''), TYPE).value('.[1]', 'NVARCHAR(MAX)')
, 1, 1, ''), 'Not Assigned Yet') AS SubjectList
FROM dbo.Students STD

Here, TYPE directive forces the data type of this FOR XML query as xml data type on which we can apply xml type methods. In this query, value method has been applied with ‘.[1]’ as XQuery expression and NVARCHAR(MAX) as SQLType.

##Addition ends here##

A Question For You:

Are you interested to answer a simple question asked below?

Q : What will be the output of the below query:

SELECT LEN('ABC ')

Length of ABC string followed by 3 blank spaces using LEN function in SQL Server.

It may be a silly question for you but trust me, it can help some others to explore the way LEN function works.

I request you to try it without executing it in SQL and put your answer in comment section honestly. You can also put your explanation to help others.

Thanks for your support and patience.

Rate This
[Total: 3    Average: 4.7/5]


Gopal Krishna Ranjan

About Gopal Krishna Ranjan

I am Gopal Krishna Ranjan, having 8 years of industry experience in Software development. I have a head down experience in Database, Data Warehouse, Big Data and cloud technologies and have implemented end to end Database, Data Warehouse,  Big Data and Cloud Solutions.
I have extensively worked on SQL Server, Python, Hadoop, Hive, Spark, Azure, Machine Learning, and MSBI (SSAS, SSIS, and SSRS). I also have good experience in windows and web application development using ASP.Net and C#.


Leave a Reply to john Cancel reply

Your email address will not be published. Required fields are marked *

*

9 thoughts on “Convert rows into comma separated values in a column

  • inetryconydot

    ORDER CHEAP TADACIP ON LINE WITHOUT PRESCRIPTION!

    * TOP PHARMACY LIST!
    * MEDICATION WITHOUT A PRESCRIPTION!
    * YOU CAN FIND: THE BEST QUALITY PILLS!
    * WE GUARANTEE THAT ONCE YOU HAVE PURCHASED A PRODUCT FROM US YOU WILL GET THAT PRODUCT

    WE THANK YOU FOR VISITING APPROVED ONLINE PHARMACY © 2013.

    [b]Tags:[/b][b]Tadacip Over The Counter, Tadacip Online No Prescription, Where To Buy Tadacip, Tadacip Without Prescriptions, Can You Buy Tadacip Online, Where To Buy Tadacip[/b]

  • Bikesh srivastava

    // Create coloumn with seprated column value.from row value
    SELECT DISTINCT p.[Account ID],p.[Contact ID],p.[Account Name],p.[Email],p.[Contact Role],
    STUFF((SELECT distinct ‘,’ + p1.Role
    FROM finalsheet p1
    WHERE p.[Contact ID]= p1.[Contact ID]
    FOR XML PATH(”), TYPE
    ).value(‘.’, ‘NVARCHAR(MAX)’)
    ,1,1,”) RoleS
    FROM finalsheet p
    –group by p.[Account ID],p.[Contact ID],p.[Account Name],p.[Email],p.[Contact Role]

  • Zeid

    And here is another way to do it if you are trying avoid XML. Though xml is more flexible, efficient and dynamic.

    ;With TT as (
    select ST.StudentID, CLASS, s.SubjectID, SubjectName from dbo.Subjects s join dbo.StudentSubjectMappings gs on s.SubjectID = gs.SubjectID
    RIGHT JOIN STUDENTS ST ON GS.STUDENTID = ST.STUDENTID)
    , TTT AS
    (select *
    from ( select StudentID,CLASS, SubjectID, SubjectName from TT) src
    pivot
    ( MAX(SubjectName)for SubjectID in ([1], [2], [3],[4], [5], [6])) piv)
    , TTTT AS
    (SELECT STUDENTID,CLASS, ISNULL([1], ”)+’,’+ISNULL([2],”)+’,’+ISNULL([3],”)+’,’+ISNULL([4],”)+’,’+ISNULL([5],”)+’,’+ISNULL([6], ”)[SUBJECT]
    FROM TTT )
    ,TTTTT AS(SELECT STUDENTID,CLASS, REPLACE([SUBJECT], ‘,,’,”) [SUBJECT] FROM TTTT)
    SELECT StudentID ,Class, CASE
    WHEN LEFT([SUBJECT], 1)= ‘,’ THEN ‘Not Assigned Yet’
    ELSE [SUBJECT] END AS [SUBJECTLIST] FROM TTTTT ORDER BY StudentID

  • Jeff Moden

    Gopal,

    No… not “geniuses”. As John said, we’ve both been bit by these problems in the past. It’s just a bit of experience. Wink

  • Jeff Moden

    Spot on, John.

    There’s another impact to adding the “de-entitization” to the XML concatenation by using TYPE… it causes the query to take approximately 30% longer.

    Here are the run results I get on a million row test…

    –===== Test the XML concatenation WITH de-entitization.

    (50000 row(s) affected)

    SQL Server Execution Times:
    CPU time = 6692 ms, elapsed time = 7509 ms.
    –===== Test the XML concatenation WITHOUT de-entitization.

    (50000 row(s) affected)

    SQL Server Execution Times:
    CPU time = 5148 ms, elapsed time = 5446 ms.

    Don’t take my word for it though. Do your own testing. Here’s the test code I used. Remember not to take the first set of results because data will be cached in real life. Try adding some indexes and see what happens.

    DROP TABLE #TestTable
    GO
    –===== Create a million row test table.
    — SomeInt wil contain values from 1 to 50,000.
    — SomeLetters2 will contain values from “AA” thru “ZZ”
    — This only takes 2 seconds on my laptop. It uses a
    — “pseudeo-cursor” contained in every SELECT (thay all
    — loop behind the scenes and is amplified in quantity
    — by the CROSS JOIN.
    SELECT TOP 1000000
    RowNum = IDENTITY(INT,1,1)
    ,SomeInt = ABS(CHECKSUM(NEWID()))%50000+1
    ,SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)
    INTO #TestTable
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    ;
    GO
    ———————————————————————
    PRINT ‘–===== Test the XML concatenation WITH de-entitization.’
    SET STATISTICS TIME ON
    ;
    SELECT SomeInt
    ,CSVList =
    STUFF(
    (
    SELECT ‘, ‘ + SomeLetters2
    FROM #TestTable tt2
    WHERE tt2.SomeInt = tt1.SomeInt
    ORDER BY SomeLetters2
    FOR XML PATH(”),TYPE
    ).value(‘.[1]’, ‘VARCHAR(max)’
    ), 1, 1, ”)
    FROM #TestTable tt1
    GROUP BY tt1.SomeInt
    ;
    SET STATISTICS TIME OFF
    ;
    GO
    ———————————————————————
    PRINT ‘–===== Test the XML concatenation WITHOUT de-entitization.’
    SET STATISTICS TIME ON
    ;
    SELECT SomeInt
    ,CSVList =
    STUFF(
    (
    SELECT ‘, ‘ + SomeLetters2
    FROM #TestTable tt2
    WHERE tt2.SomeInt = tt1.SomeInt
    ORDER BY SomeLetters2
    FOR XML PATH(”)
    )
    , 1, 1, ”)
    FROM #TestTable tt1
    GROUP BY tt1.SomeInt
    ;
    SET STATISTICS TIME OFF
    ;
    GO

  • Gopal Krishna Ranjan
    Gopal Krishna Ranjan

    Hi @Jeff Moden, thanks for highlighting this point missed completely by me and @John Imel, thanks for answering this nicely. It’s really appreciable and helpful.
    Both of you have been credited for your nice and helping effort in main post with an addition flag tag. You really deserve this credit.
    Hats off to both of you genius! Please keep posting your comments and suggestions.

  • john

    here you go JeffSmile

    1) 6
    2) 3 even though its nvarchar len still returns number of characters not number of bytes used less blanks
    3) 12 as its nvarchar and thus uses 2 bytes per character ( this one got me a few times in the past) so it returns 12 ie 6 character x 2bytes per character = 12

    the & sign would be converted to & ie for xml path converts special characters to html entities as its an xml specific function.

    here is the fix to it

    SELECT StudentID, StudentName, Class,
    ISNULL(STUFF((SELECT ‘, ‘ + SubjectName
    FROM dbo.StudentSubjectMappings SSM
    INNER JOIN dbo.Subjects SUB ON SUB.SubjectID = SSM.SubjectID
    WHERE SSM.StudentID = STD.StudentID
    FOR XML PATH(”), TYPE ).value(‘.[1]’, ‘nvarchar(max)’
    ), 1, 1, ”), ‘Not Assigned Yet’) AS SubjectList
    FROM dbo.Students STD

    Impact to query or better the results is dependent on your usage, if this was for display on an html page then no impact as the & would still display as & but if its not then you got html entities in your lists. where this has gotten me into issues before is when you need to then compare the csv string to something and “Algebra & Trig” does not equal “Algebra & Trig”

    Nice questions JeffSmile

  • Jeff Moden

    The answer to the question is “3” because LEN does not consider trailing blanks.

    Let’s continue in that same vein. What do the following return. As you say, answer without running the code.

    1. SELECT DATALENGTH(‘ABC ‘);
    2. SELECT LEN(N’ABC ‘);
    3. SELECT DATALENGTH(N’ABC ‘);

    Next question. If we change “Mathematics” to “Algebra & Trig” the “Subjects” table in your concatenation example, what fault will occur, how can it be fixed, and what is the impact to the query?

    If no one answers by Wednesday, I’ll provide the answers.