Sorting characters in varchar2 alphabetically

I'm looking for a function that sorts the characters in varchar2 alphabetically.

Is there something built in to oracle that I can use, or do I need to create a UI in PL / SQL?

+2


a source to share


5 answers


From the answer at http://forums.oracle.com/forums/thread.jspa?messageID=1791550 this might work, but no need to test 10g ...

SELECT MIN(permutations)
FROM (SELECT REPLACE (SYS_CONNECT_BY_PATH (n, ','), ',') permutations
    FROM (SELECT LEVEL l, SUBSTR ('&col', LEVEL, 1) n
        FROM DUAL
        CONNECT BY LEVEL <= LENGTH ('&col')) yourtable
    CONNECT BY NOCYCLE l != PRIOR l)
WHERE LENGTH (permutations) = LENGTH ('&col')

      



The example is col

defined in SQL * Plus, but if you make this function you can pass it or you can rework it so that I can use a table column I suppose.

I would take this as a starting point, not a solution; the original question was about anagrams, so it is meant to find all permutations, so something similar is possible, but simplified. I suspect this doesn't scale well for large values.

+1


a source


You must remember that there is no general agreement on what "alphabetically" means. It all depends on what country it is in and who is looking at your data and what context it is in.

For example, in DK there are a large number of different sortings a, aa, b, c, æ, ø, å



  • alphabetically: a, aa, b, c, æ, ø, å
  • for some dictionary: a, aa, å, b, c, æ, ø
  • for other dictionaries: a, b, c, æ, ø, aa, å
  • according to Microsoft standard: a, b, c, æ, ø, aa, å

check out http://www.siao2.com/2006/04/27/584439.aspx for more information. Which also happens to be a great blog for issues like these.

+1


a source


So in the end I went down the PL / SQL path because after some searching I realized that there is no built-in function I can use.

Here's what I came up with. It is based on the future of associative array, which is that Oracle stores keys in sorted order.

create or replace function sort_chars(p_string in varchar2) return varchar deterministic
as
     rv varchar2(4000);
     ch  varchar2(1);
     type vcArray is table of varchar(4000) index by varchar2(1);
     sorted vcArray;

     key varchar2(1);

begin
     for i in 1 .. length(p_string)
     loop
        ch := substr(p_string, i, 1);

        if (sorted.exists(ch))
        then 
            sorted(ch) := sorted(ch) || ch;
        else
            sorted(ch) := ch;
        end if;
     end loop;


    rv := '';
    key  := sorted.FIRST;
    WHILE key IS NOT NULL LOOP
        rv := rv || sorted(key);
        key := sorted.NEXT(key);
    END LOOP;

     return rv;
end;

      

Simple performance test:

set timing on;

create table test_sort_fn as 
select t1.object_name || rownum as test from user_objects t1, user_objects t2;

select count(distinct test) from  test_sort_fn;

select count (*)  from (select sort_chars(test)  from test_sort_fn);


Table created.
Elapsed: 00:00:01.32

COUNT(DISTINCTTEST)
-------------------
             384400
1 row selected.
Elapsed: 00:00:00.57

  COUNT(*)
----------
    384400
1 row selected.
Elapsed: 00:00:00.06

      

+1


a source


Assuming you don't mind having characters returned 1 per line:

select substr(str, r, 1) X from (
select 'CAB' str,
       rownum r
from dual connect by level <= 4000
) where r <= length(str) order by X;

X
=
A
B
C

      

0


a source


You can use the following query:

select listagg(letter) 
    within group (order by UPPER(letter), ASCII(letter) DESC) 
from
(
select regexp_substr('gfedcbaGFEDCBA', '.', level) as letter from dual
connect by regexp_substr('gfedcbaGFEDCBA', '.', level) is not null
);

      

The subquery splits the string into records (one character each) using regexp_substr, and the outer query concatenates the records into one line using listagg after sorting them.

You have to be careful here, because sorting alphabetically depends on your database configuration, as Sinus pointed out.

In the example above, the letters are sorted in ascending "alphabetical order" and descended in ascii code, which in my case results in "aAbBcCdDeEfFgG". The result in your case may be different.

You can also sort letters using nlssort - this will give you better control over the sort order as you gain independence from your database configuration.

select listagg(letter) 
    within group (order by nlssort(letter, 'nls_sort=german') 
from
(
select regexp_substr('gfedcbaGFEDCBA', '.', level) as letter from dual
connect by regexp_substr('gfedcbaGFEDCBA', '.', level) is not null
);

      

The query above will also give you "aAbBcCdDeEfFgG", but if you change "German" to "Spanish" you will get "AaBbCcDdEeFfGg" instead.

0


a source







All Articles