
Re: sysadmin qualifications (Re: apt-get vs. aptitude)



On 10/21/2013 5:26 PM, berenger.morel@neutralite.org wrote:
On 18.10.2013 19:36, Jerry Stuckle wrote:
On 10/18/2013 1:10 PM, berenger.morel@neutralite.org wrote:
On 18.10.2013 17:22, Jerry Stuckle wrote:
On 10/17/2013 12:42 PM, berenger.morel@neutralite.org wrote:
On 16.10.2013 17:51, Jerry Stuckle wrote:
I only know a few people who actually like them :)
I liked them too, at one time, but since I can now use standard smart
pointers in C++, I tend to avoid them. I had so much trouble with them
that now I only use them for polymorphism and sometimes RTTI.
I hope that someday references will become usable in standard
containers... (I think they are not because of technical problems, but
I do not know a lot about that. C++ is easy to learn, but hard to
master.)


Good design and code structure eliminate most pointer problems; proper
testing will get the rest.  Smart pointers are nice, but in real-time
processing they are an additional overhead (and an unknown one at
that, since you don't know the underlying libraries).

It depends on the smart pointer. shared_ptr does indeed have a runtime
cost, since it maintains additional data, but unique_ptr does not,
afaik; it is made from pure templates, so the cost is only at compile
time.
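
A quick way to see the difference in cost is to compare the sizes of
the three; a minimal sketch, assuming a typical 64-bit build with
libstdc++ or libc++ (the exact numbers depend on the toolchain):

    #include <cstdio>
    #include <memory>

    int main() {
        // unique_ptr with the default deleter stores only the raw
        // pointer, so it is usually pointer-sized; shared_ptr also
        // carries a pointer to its control block (the reference
        // counts), so it is usually twice that size.
        std::printf("int*       : %zu bytes\n", sizeof(int*));
        std::printf("unique_ptr : %zu bytes\n",
                    sizeof(std::unique_ptr<int>));
        std::printf("shared_ptr : %zu bytes\n",
                    sizeof(std::shared_ptr<int>));
        return 0;
    }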


You need to check your templates.  Templates generate code.  Code
needs resources to execute.  Otherwise there would be no difference
between a unique_ptr and a C pointer.

In practice, you can replace every occurrence of std::unique_ptr<int>
with int* in your code. It will still work and have no bugs, except,
of course, that you will have to remove some ".get()", ".release()"
and things like that here and there.
You cannot do the inverse transformation, because you cannot copy a
unique_ptr.

The only use of unique_ptr is to forbid some operations. The code it
generates is the same as you would have written around your raw
pointers: new, delete, swap, etc.
Of course, you can say that the simple fact of calling a method has an
overhead, but most of unique_ptr's machinery is inlined, even before
speaking of compiler optimizations.
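
A minimal sketch of that equivalence (the example and the names are
mine, not from any real code base):

    #include <memory>

    // With a raw pointer: you write the delete yourself, and must
    // not forget it on any return path.
    int raw_version() {
        int* p = new int(42);
        int v = *p;
        delete p;
        return v;
    }

    // With unique_ptr: same new, the delete is generated for you in
    // the destructor, and copying the owner is a compile error.
    int unique_version() {
        std::unique_ptr<int> p(new int(42));
        int v = *p;
        return v;  // delete happens here automatically
        // std::unique_ptr<int> q = p;  // would not compile: no copy
    }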


Even inlined code requires resources to execute.  It is NOT as fast
as regular C pointers.

I did some testing, to be sure. With -O3, the code is exactly the
same. I did not try with -O1 and -O2. Without optimization, the 5
lines with raw pointers compiled to half the size of the ones using
unique_ptr. But I never ship software that is not optimized (the level
depends on my needs, and usually I do not use -O3, though).


First of all, with the -O1 and -O2 optimization you got extra code. That means the template DOES create more code. With -O3, your *specific test* allowed the compiler to optimize out the extra code. But that's only in your test; other code will not fare as well.

unique_ptr must manage the object at *run time* - not at *compile time*. To ensure uniqueness, there has to be an indication in the object that it is being managed by a unique_ptr object. Additionally, when the unique_ptr object is destroyed, the object being pointed to must also be destroyed.

Neither of these can be handled by the compiler. There must be run-time code associated with the unique_ptr to ensure the above.

You should look at the unique_ptr template code. It's not easy to read or understand (none of STL is). But you can see the code in it.
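
For reference, here is a stripped-down sketch of what such an
owning-pointer template boils down to. It is my own simplification,
not the actual libstdc++ or libc++ source, which also handles custom
deleters, arrays, and so on:

    // Minimal owning pointer: holds one raw pointer, deletes it in
    // the destructor, forbids copying, allows moving.
    template <typename T>
    class owning_ptr {
        T* ptr_;
    public:
        explicit owning_ptr(T* p = nullptr) : ptr_(p) {}
        ~owning_ptr() { delete ptr_; }

        owning_ptr(const owning_ptr&) = delete;             // no copy
        owning_ptr& operator=(const owning_ptr&) = delete;

        owning_ptr(owning_ptr&& other) : ptr_(other.ptr_) {
            other.ptr_ = nullptr;
        }
        owning_ptr& operator=(owning_ptr&& other) {
            if (this != &other) {
                delete ptr_;
                ptr_ = other.ptr_;
                other.ptr_ = nullptr;
            }
            return *this;
        }

        T& operator*() const { return *ptr_; }
        T* get() const { return ptr_; }
    };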

Plus, an OS contains applications. Kernels, drivers, and applications.
Take Windows, and say honestly that it does not contain applications:
explorer, mspaint, calc, msconfig, notepad, etc. Those are
applications, nothing more, nothing less, and they are part of the OS.
They simply have to deal with the OS's API, as you will with any other
application. Of course, you can use more and more layers between your
application and the OS's API; to stay in a pure Windows environment,
there are (or were) for example MFC and .NET. To be more general, Qt,
wxWidgets and GTK are other such tools.


mspaint, calc, notepad, etc. have nothing to do with the OS. They
are just applications shipped with the OS.  They run as user
applications, with no special privileges; they use standard
application interfaces to the OS, and are not required for any other
application to run.  And the fact they are written in C is
immaterial.

So, what you call an OS is only drivers+kernel? If so, then ok. But
some people consider that it includes various other tools which do not
require hardware access. I spoke about graphical applications, and you
disagree. A matter of opinion, or maybe I did not pick the right
examples, I do not know.
So, what about dpkg in Debian? Is it part of the OS? Isn't it a ring 3
program? The same for tar or the shell?


Yes, the OS is what is required to access the hardware.  dpkg is an
application, as are tar and shell.

< snip >
Just because something is supplied with an OS does not mean it is
part of the OS.  Even DOS 1.0 came with some applications, like
command.com (the command line processor).


So, it was not a bad idea to ask what you call an OS. So, everything
which runs in rings 0, 1 and 2 is part of the OS, but not software
running in ring 3? Just to confirm.


Not necessarily.  There are parts of the OS which run at ring 3, also.

What's important is not what ring it's running at - it's whether the
code is required to access the hardware on the machine.

I disagree, but that is not important, since at least now I can use
the word with the same meaning as you do, which is far more important.

But all of this has nothing to do with the need to understand the
basics of what you use when writing a program. Not understanding, in
broad lines, how a resource you acquired works implies that you will
not be able to manage it correctly by yourself. That is true for RAM,
but also for the CPU, network sockets, etc.


Do you know how the SQL database you're using works?

No, but I do understand why comparing text is slower than comparing
integers on x86 computers: because I know that an int can be stored in
one word, which can be compared with only one instruction, while text
requires comparing more than one word, which is indeed slower. And it
can become even worse when the text is not ASCII.
So I can use that understanding to explain why I often avoid using
text as keys. But sometimes the more problematic cost is not speed but
memory, and so sometimes I will use text as keys anyway.
Knowing the word size of the SQL server is not needed to make things
work, but it helps to make them work faster, instead of requiring you
to buy more hardware.
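
As a sketch of that mechanism (my own example, not tied to any
particular SQL engine): an integer key comparison is a single machine
compare, while a text key comparison is a byte-by-byte loop whose cost
grows with the length of the key:

    #include <cstring>

    // Integer key: one comparison instruction on any common CPU.
    bool int_key_less(int a, int b) {
        return a < b;
    }

    // Text key: strcmp walks the strings byte by byte until it finds
    // a difference or a terminating '\0', so the cost depends on the
    // length of the keys.
    bool text_key_less(const char* a, const char* b) {
        return std::strcmp(a, b) < 0;
    }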


First of all, there is no difference between comparing ASCII text and
non-ASCII text, if case-sensitivity is observed.

Character size, in bits. ASCII uses 7 bits, extended ASCII uses 8,
UTF-8 uses 8 per code unit, UTF-16 uses 16, etc. It has an impact on
memory, bandwidth and the instructions used.


But ASCII, even if it only uses 7 bits, is stored in an 8-bit byte.
A 4-byte ASCII character field will take up exactly the same amount of
room as a 32-bit integer.  And comparison can use exactly the same
machine language instructions for both.

Be fair. Compare the maximum number you can represent with a uint32_t
with the maximum you can represent with a char*.
For int8_t versus a single char: 256 values versus 10, if we limit
ourselves to digits. If we limit ourselves to printable ASCII
characters, ok, we can have... a little more than 100, but what a
strange numeric base that would be...


The max number of values you can store in a char is 2^8. The max you can represent with a 4 byte character field is exactly the same as that of an int. Both have 32 bits, so it's 2^32. Nothing says a char* field has to contain only the digits 0-9 (or A-Z for that matter).
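
A small sketch of that idea (my own example; whether a given database
or compiler actually does it this way is implementation-dependent):
four 8-bit characters occupy the same 32 bits as an int, so an
equality test can be done on the whole word at once:

    #include <cstdint>
    #include <cstring>

    // Compare two fixed 4-byte character fields by loading each into
    // a 32-bit integer and comparing the words. memcpy avoids
    // alignment and aliasing problems; compilers turn it into a
    // single load.
    bool char4_equal(const char a[4], const char b[4]) {
        std::uint32_t wa, wb;
        std::memcpy(&wa, a, 4);
        std::memcpy(&wb, b, 4);
        return wa == wb;  // one 32-bit compare, same as for an int
    }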

The exact same set
of machine language instructions is generated.  However, if you are
doing a case-insensitive comparison, ASCII is definitely slower.

And saying "comparing text is slower than integers" is completely
wrong.  For instance, a CHAR(4) field can be compared just as quickly
as an INT field, and CHAR(2) may in fact be faster, depending on many
factors.

It is only partially wrong. Comparing a text of 6 characters will be
slower than comparing a short.
6 characters: "-12345", and you have the same data in only 2 bytes.


I didn't say 6 characters.  I SPECIFICALLY said 4 characters - one
case where your "strings take longer to compare than integers" claim
is wrong.

With your char[4], you can only go up to 9999; even a short can
represent more, and so compute (including comparison) faster.
Now, you say I'm completely wrong, when you take one case where I am
wrong. I agreed with you that I was partially wrong, but by saying
that comparing text is slower than comparing integers, I did not
specifically mean the same binary size. Plus, in the SQL context, I
thought it was obvious that I was referring to the fact that integers
are sometimes good replacements for primary keys.


And who says you are limited to digits? And since both are 4-byte fields, they can hold exactly the same number of bits, and therefore the same number of values. Now, often a char[4] will only contain alphanumeric values, which limits the number of values it does contain. But either way, the comparison is exactly the same.

I never said integers weren't good replacements as primary keys; I just disagreed with your statement that char fields always take longer for comparison. They do not.

But, fine. You said that, uppercase/lowercase aside, it's the same?
So, how would you sort texts which include, for example: éèê?
Those characters are greater than 'f' if you take only their numerical
values, but then the results would be... just wrong. They should, at
least in French, be considered equal to 'e'. That's what users would
expect. So comparing texts will need a replacement pass and then a
comparison, for the simplest (and the worst, since it does not sort
the accented characters themselves, and it loses information)
solution. And we have plenty of characters like that here, which we
use in a lot of words (and names, indeed).
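
For what it's worth, here is a sketch of that kind of locale-aware
comparison, using the standard collate facet. It assumes the source
and literals are UTF-8 and that a "fr_FR.UTF-8" locale is installed on
the system; how well the narrow-char facet handles multi-byte UTF-8
varies between C++ libraries:

    #include <iostream>
    #include <locale>
    #include <string>

    int main() {
        // Byte-wise comparison: "école" sorts after "zoo", because
        // the UTF-8 bytes of 'é' compare greater than 'z'.
        std::string a = "école", b = "zoo";
        std::cout << "byte-wise a < b: " << (a < b) << "\n";

        // Locale-aware comparison: the collate facet of a French
        // locale places 'é' with 'e', so "école" sorts before "zoo".
        std::locale fr("fr_FR.UTF-8");  // throws if not installed
        const auto& coll = std::use_facet<std::collate<char>>(fr);
        int r = coll.compare(a.data(), a.data() + a.size(),
                             b.data(), b.data() + b.size());
        std::cout << "collated  a < b: " << (r < 0) << "\n";
        return 0;
    }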


How can you do it in an integer field?

But now you're changing the rules (again), bringing into play localization. I said nothing about that in my previous statements. A plain char[4] field does not include localization.

Text is directly linked to localization, so you cannot simply compare
it as if it were integers. That only works with English (or maybe
there is another language which uses only 26 letters without accents,
but I do not know of one). The old trick of treating chars as bytes
should no longer be taught, in my opinion. Students could use it in
real situations where there are accented characters, and cause bugs
for nothing.


Text is not necessarily linked to localization. Maybe in France it is, but not here in the U.S.

Plus, CHAR(4) is not necessarily encoded in 4 bytes. Characters and
bytes are different notions.
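
A concrete illustration of that distinction (my own example, and it
assumes the source file and the string literal are UTF-8): four
characters are not necessarily four bytes:

    #include <cstring>
    #include <iostream>

    int main() {
        // "éèêa" is 4 characters, but é, è and ê each take 2 bytes
        // in UTF-8, so the string occupies 7 bytes (plus the
        // terminating '\0').
        const char* s = "éèêa";
        std::cout << std::strlen(s) << " bytes for 4 characters\n";
        return 0;
    }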


In any current database, CHAR(4) for ASCII data is encoded in 4
bytes. Please show where that is not the case.

Ok, you got me on the char(4) stuff. All that time I was thinking
about text in general, but forgot to be explicit enough. I thought I
mentioned Unicode somewhere... but I'm too lazy to check.


I was talking specifically about char(4). I did not mention unicode or other character sets.

As a side note, I'm surprised. CHAR does not seem to be in the standard:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCwQFjAA&url=http%3A%2F%2Fjtc1sc32.org%2Fdoc%2FN1951-2000%2F32N1964T-text_for_ballot-FCD_9075-2.pdf&ei=6JBlUoPXNoa00QWCrIGYAw&usg=AFQjCNHiJl_XShEUGPmObfmrji81RtDVNg&sig2=1vIOHIp64_oLVO8rIuMjIA&bvm=bv.54934254,d.d2k



It is. See page 151 - CHAR is a reserved word. Page 177 - CHAR can be used in place of CHARACTER.

But if an extra 4 byte key is going to cause you memory problems,
your hardware is already undersized.

Or your program could be too hungry, because you did not know that you
had limited hardware.


As I said - your hardware is already undersized.  If adding 4 bytes
to a row is going to cause problems now, you'll have even greater
problems later.

Smaller stuff is often better. It is not a problem of hardware being
undersized.


Smaller is better within limits. Clarity and portability are more important, IMHO.

And here, we are not talking about simple efficiency, but about
something which can make an application completely unusable, with
"random" errors.


Not at all.  Again, it's a matter of understanding the language you
are using.  Different languages have different limitations.

So it must be that C's limitations are not fixed enough, because type
sizes can vary according to the hardware (and/or compiler).


Sure.  And you need to understand those limitations.

Indeed. And those are dependent on hardware and compiler, for the C and
C++ languages at least.


No, they are completely dependent on the compiler being used.

And BTW - even back in the days of 16-bit PCs, C compilers still
used 32-bit ints.

I can remember having used Borland Turbo C (I can't remember the
version; I do not even remember whether it supported C++), and its
"int" type was a short: 16 bits. It is when I switched to another
(more recent) compiler that I stopped using int, for this exact
reason. And when I discovered stdint.h, I immediately loved it. No
more surprises, no implicitly defined sizes.
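
A small sketch of the difference (my own example): the built-in types
only have guaranteed minimum sizes, while the <cstdint> types are
exact, and a static_assert can document the assumption instead of
leaving a surprise for the next platform:

    #include <cstdint>
    #include <cstdio>

    int main() {
        // int is only guaranteed to be at least 16 bits; on Turbo C
        // it was 16, on current desktop compilers it is usually 32.
        std::printf("sizeof(int)          = %zu\n", sizeof(int));

        // int32_t is exactly 32 bits wherever it exists, so code
        // using it does not change meaning between compilers.
        std::printf("sizeof(std::int32_t) = %zu\n",
                    sizeof(std::int32_t));

        // If the code really depends on a size, say so explicitly:
        static_assert(sizeof(int) >= 4,
                      "this program assumes at least 32-bit int");
        return 0;
    }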


Yes, and long was 32 bits. Check the C and C++ specs. The only thing they say is that short <= int <= long. There is no reason an int could not have been 32 bits; it's just the way Borland defined it in their compiler.

So, ok, if you can find a job where every single low-level feature you
require is available through high-level functions/objects, then having
knowledge of what you are sitting on is useless. Maybe I am wrong,
because I actually am interested in knowing what is behind the screen,
and not only in the screen itself. But still, if you only know about
your own stuff, and the man who will deploy it only knows about his
own stuff, won't you need a third person to allow you to communicate?
Which implies a loss of time.

No, it's called a DESIGN - which, amongst other things, defines the
communications method between the two.  I've been on projects with up
to 100 or so programmers working on various pieces.  But everything
worked because it was properly designed from the outset and every
programmer knew how his/her part fit into the whole program.

I do not think that most programmers work in teams of hundreds of
people. But I may be wrong. I do not know.


I didn't say most did.  I DID say they exist, for large projects.

I never said that you said all did. I simply said I do not know the
average size of programming teams around the world.

I think, but it's pure guessing based on my small experience, that IT
departments with more than 30 people in R&D are not the most common
situation, and if I am right, then programmers... no, people, have to
be able to interact with other teams which are doing different things.
And to interact, you need to be able to speak the same language.
At least, that's what I have seen in my experience. But, indeed, my CV
is far less impressive than yours, and I would never use it to prove
that I am right.

(Note: we were exactly 6 programmers. There were 3 sysadmins, 1
project lead, and 3 others for support.)


In most companies, programmers are not part of R&D. They are there to support the business. For instance, in an insurance company, they write programs to support customers, transactions, various policy types offered by the company, billing, accounting, payroll - on and on. Engineering firms (like the one it sounds like you work for) have a much higher percentage in R&D, often because they are smaller and use a lot more packaged solutions to run their businesses, whereas there are no packaged solutions to support their engineering needs (e.g. ARM controllers, etc.).

In my last job, when we had something to release, we usually talked
directly with the people who then had to deploy it, to explain to them
some requirements and consequences that were not directly our job as
programmers. Indeed, I was not employed by Microsoft, Google or IBM;
very far from that, we were fewer than 10 developers.
But now, are most programmers paid by companies with hundreds of
programmers?


In the jobs I've had, the programmers have never had to talk to the
deployers.  Both were given the design details; the programmers wrote
to the design and the deployers generated the necessary scripts to
deploy what the design indicated.  When the programmers were done, the
deployers were ready to install.

Maybe you worked only in big organizations, or maybe this one was
doing things wrong. But the IT team was quite small, if we only count
sysadmins, devs, project leads and a few other roles. Fewer than 15
people.


Even 15 person teams can do it right.  Unfortunately, too many
companies (both big and small) won't do it right.  A major reason why
there are so many bugs out there.  Also a major reason why projects go
over time and over budget.

Sorry, but I do not see what's wrong with having communication between
the various actors of a project.
I have taken various classes on good ways to manage projects, stuff
about ITIL for example, and I cannot remember having learned that that
part of the process was wrong in my last job. Other parts were, and
not only a little, but I do not think I learned that something was
wrong in that part. But those were only lessons, and I never read the
whole ITIL material.



There's no problem with having communications with small groups of programmers. But when you have > 100 programmers working on a project, you can't have all of them communicating with each other - nothing would get done. So you break them up into small teams and assign each group a piece of the program. Those people talk together, and the team leaders talk together. It is much more manageable.

But "doing it right" has very little to do with team size. Rather, it has to do with management.

(Long story)
I was brought in to manage a project one time; this team had a history of going over time and over budget, and producing code which had lots of bugs. The company didn't understand why; they had good programmers and the requirements were clear.

It became clear to me very early on when I got the requirements and brought in the leaders to start designing the project. I was told by the leaders that they needed to be programming - as their team members did - per management. I spoke to the manager and he told me flat out "If my programmers aren't writing code, they are not being productive". My response was "If they don't know what they're writing, how can that be productive?" It turned out they had people working toward different targets as to how it was to be done; the results weren't pretty. Lots of code had to be rewritten, sometimes from scratch (actually the best option), or heavily modified to work with what someone else had done. And each modification introduced the probability of new bugs.

To make a long story short, I had one heck of a fight with the manager and almost walked off the job. But I finally convinced him to do it my way (with the help of upper management, who had brought me in to make it work). And I got him to let us do the design (and include the programmers in the process - it was a small team). The result was a project which came in under budget, before the deadline, and had few bugs (I also instituted a quality control process to check code along the way).

To be fair, it wasn't all the manager's fault. He was under a lot of pressure from above to do better. But never having been a programmer himself, he had never seen how a well-oiled operation worked. But he never listened to the couple of guys on his team who tried to convince him of how it should run before I was brought in.

The final result was that the programmers and manager got nice bonuses for getting this project done right, and I got a pink slip :) Such is the life of a consultant.

Jerry

