Friday, April 29, 2011

Finding best position for element in list

Hello,

I have List collection that is populated in specific order (the requirement is that, this order can not be changed). This list contains entity type objects.

After initial population of the list, I need to insert few more object, that are coming from another data source. These objects need to be inserted at specific position, so that sorting is correct.

For example if initial list has following elements

  1. AAA
  2. AAB
  3. AAC
  4. ACC
  5. ADA

After initial population I want to insert "ABB" element, it need to be inserted between 3 and 4.

At moment I have following method of finding correct position for new elements.

    private static int FindPositionForArticle(string word)        
    {
        string key = word.ToLower();
        for (int i = word.Length; i >= 0; i--)
        {
            if(i < word.Length)
                key = key.Remove(i, 1);

            int pos = 0;
            int insertPos = 0;
            foreach(ArticleEntity article in list)
            {
                if(article.Text.ToLower().StartsWith(key))
                    insertPos = pos;
                else if (!article.Text.ToLower().StartsWith(key) && insertPos > 0)
                    return insertPos++;
                pos++;
            }
        }
        return 0;
    }

The purpose idea behind this method:

  1. Take "word" that need to be inserted and try to find position of element with same name as "word"

  2. If nothing has been found, remove last character from "word" and search again.

  3. Repeat removal of last characters until best position has been found.

Unfortunately my methods has bugs(implemented incorrectly). Currently my methods suggests that best position would be 0, which is totally incorrect.

If You want to play with my example code You may download it at:

http://dl.getdropbox.com/u/204110/FindPosition.cs.txt

Thank You in advance.

From stackoverflow
  • int index = list.BinarySearch(word);
    

    If index is positive or zero, the item has been found in the list. If negative, it contains the bitwise complement of the index of the next highest item in the list.

    So, for this:

    List<string> list = new List<string>{"AAA","AAB","AAC","ACC","ADA"};
    int index = list.BinarySearch("ABB"); // => -4
    int insertIndex = ~index; // => 3
    list.Insert(insertIndex, "ABB");
    
    Daniil Harik : The problem is my list is actually public class ArticleEntityCollection : ObservableCollection May be I should refactor my code to simple List
    Chris Doggett : Yeah, unfortunately, ObservableCollection doesn't implement BinarySearch. If you really need the observable functionality, you could always derive from List and override the add/remove functions to fire events notifying that the list had changed.
    Daniil Harik : I've refactored to List. As I was not using ObservableCollection functionality. Your solution worked fine.
  • Have you considered using a SortedList data structure? If it will hold complex data types that need to be sorted in a certain specific way, you can create it specifying an IComparer object.

    The only thing the IComparer has to do is be able to determine if one item belongs before, or after another. Given that, the SortedList will keep the order correctly.

    Using this structure would guarantee that the ordering remains as you specify internally and that way you won't have to maintain code that essentially implements sorting.

    Daniil Harik : Could you please give an example of how such IComparer may look. Thank You very much.
    Ben S : It you're using strings and want them alphabetically sorted then you don't even need to bother with an IComparer. Otherwise, have a look at the MSDN page about it, it has a VB and a C# example: http://msdn.microsoft.com/en-us/library/system.collections.icomparer.aspx
  • Wouldn't a SortedList of type string work better?

  • This code:

        private static System.Collections.SortedList list = new System.Collections.SortedList();
    
        static void Main(string[] args)
        {
            list.Add("AAA" , "AAA");
            list.Add("AAB", "AAB");
            list.Add("AAC", "AAC");
            list.Add("ACC", "ACC");
            list.Add("ADA", "ADA");
            Console.WriteLine("List before");
            for (int j = 0; j < list.Count ; j++)
            {
                Console.WriteLine(list.GetByIndex(j));
            }
            list.Add("ABB","ABB");
    
    
            Console.WriteLine(list.IndexOfKey("ABB"));
    
            for (int j = 0; j < list.Count; j++)
            {
                Console.WriteLine(list.GetByIndex(j));
            }
    
    
            Console.ReadLine();
    
        }
    

    is Inserting the item where you need it..

    Or do you need something else?? there is no need for Icompare, you are sorting common strings.

    Daniil Harik : Thank You. Will try it out :)
  • Since that list you are using accessible via its index, you may want to implement a binary search similar to Chris Doggett's anwser. This will give you a better runtime than going through the list as you currently implemented it sequentially:

    Your runtime is currently linear to the number of items in the list, and if you insert n items like this you will get a runtime of about n2 operations.

    Using a binary search (as SortedList does internally by the way) you will end up with the much better runtime of log2(n) per insert, e.g. *n**log2(n) total runtime when inserting n items.

    The basic idea of a binary search is simple:

    1. List item

    2. First, you take the complete range as possible position (left = 0, right = current count in the list)

    3. If left equals right, you have found a suitable position

    4. Otherwise, pick the middle element (left+right)/2 and compare it to the key to be inserted.

      • If the result of the comparison is < 0, you set right to the middle index-1, therefore resetting the search range for the next iteration to the left half of the list
      • If the result is greater than 0, you set left to the middle index+1, so that the next search will be in the right part
      • Otherwise the comparison is 0 and you've got a duplicate, handle as you wish (ignore or insert at the same index), and break the loop (see 5.)
    5. Loop at 3.; the search range is now smaller...
  • I've done what you're trying to do like this. You have a list of your ArticleEntity objects. You need to sort this list. So first of all you need to add the ability to sort to your class. My examples are in VB, shouldn't be too hard to convert to C#.

    Add sorting to your class like so:

     Public Class ArticleEntity
        Implements IComparable(Of ArticleEntity)
    
        Public Overloads Function CompareTo(ByVal Other As ArticleEntity) As Integer _
            Implements IComparable(Of ArticleEntity).CompareTo
            Return Me._PropertyToSort.CompareTo(Other._PropertyToSort)
        End Function
    

    Then after you add everything you need to add to your List Of(ArticleEntity) you can:

    MyListOfArticleEntities.Sort()
    

    I think this is what you're looking for. Let me know if it's not.

  • Does your collection need to be a List?

    If you don't need to access elements from the list before inserting the new elements, a heap sounds like it would do the job better. Basically you would:

    1. Insert all your initial elements into the heap
    2. Later, insert the extra items in the exact same way
    3. Read out each item, in sorted order, from the heap into a List.

    Each of those steps is only O(log n) time per object. I'm certain that C# has an implementation of heaps.

    Note: I mean "heap" in the sense of the data structure, not in the sense of the global memory pool.

0 comments:

Post a Comment