[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: help with C algorythm (find unique value in an array) could you please make changes



this is really a reply to Mike:

On Tue, Jan 23, 2007 at 06:01:34AM -0500, Roberto C. Sanchez wrote:
> On Mon, Jan 22, 2007 at 11:36:18PM -0500, Mike Polyakov wrote:
> > Michael,
> > 
> > >Why not just use a std::set<int> here?  Repeated inserts of the same
> > >value will be ignored.
> > 
> > True, but that will use extra memory. Since pointers are iterators,
> > this can be done on ordinary array in place without extra memory. Only
> > thing is that unique() function modifies the original array. I have
> > used back_insert_iterator with extra class to keep track of the number
> > of unique elements, but now the code is not neat anymore:
> > 
> I think that std::set<> is the better approach.  IIRC, the standard does
> not mandate how it is implemented, so it is very likely implemented as
> an array (at least for primitive types).  If the set is unordered (that
> is, you don't expect to get things back out in the same order as you
> insert them), then the implementation is free to sort the integers as
> they are inserted.  This makes duplicate insertions happen faster than
> O(N), since the whole thing would start off with a binary search.  So,
> if you plan to have lots of duplicates, this is probably more efficient
> overall since you plan to find the element already there and the more
> expensive actual insertion, likely O(N log N), would happen less
> frequently.
> 
Mike,

Roberto says several true things, but is perhaps too polite about your
misunderstandings of the STL. Here are some things of which you should
be aware:

std::set<int> does _not_ waste memory.

You should use setname.insert( avalue ) to insert avalue into
setname. It succeeds only if avalue is not already in the set.

You should use setname.size() to learn how many unique values there
are.

You cannot learn from any memberfunction of set<int> how many times
you tried to insert values that were already in the container. Only
unique values are kept in a STL set.

The STL standard requires that operations be completed by algorithms
that meet stated complexity bounds, expressed in big-O notation. The
big-O constraints on the set<T> containers pretty much dictate that
red-black trees be used for the implementation. Internally, the
elements of a STL set must be ordered, and there is a way for the
user to define a special version of 'less()' for sets of complicated
abstract data types. For int, the standard operator< is quite sufficient.
If you interate through a set<int> you will find that the values always
are returned in ascending order, regardless of the order in which they
were inserted. If they are not in ascending order, you are using a
very strange and suspect implementation of STL.

The only reason for _not_ using std::set<int> is that you can't, for some
reason, gain legal access to a good implementation. But Debian has solved
this problem for you.

Nicolai M. Josuttis, The C++ Standard Library, covers all this and much
more. It is expensive, but perhaps you can find a copy at a nearby 
university library. 

HTH
-- 
Paul E Condon           
pecondon@mesanetworks.net



Reply to: