Re: sysadmin qualifications (Re: apt-get vs. aptitude)
On 23.10.2013 04:04, Jerry Stuckle wrote:
On 10/22/2013 8:47 PM, berenger.morel@neutralite.org wrote:
On 22.10.2013 23:01, Jerry Stuckle wrote:
On 10/21/2013 5:26 PM, berenger.morel@neutralite.org wrote:
On 18.10.2013 19:36, Jerry Stuckle wrote:
On 10/18/2013 1:10 PM, berenger.morel@neutralite.org wrote:
On 18.10.2013 17:22, Jerry Stuckle wrote:
On 10/17/2013 12:42 PM, berenger.morel@neutralite.org wrote:
On 16.10.2013 17:51, Jerry Stuckle wrote:
I only know a few people who actually like them :)
I liked them too, at one time, but since I can now use standard smart
pointers in C++, I tend to avoid them. I had so much trouble with them
that I now only use them for polymorphism and sometimes RTTI.
I hope that someday references will become usable in standard
containers... (I think they are not because of technical problems, but I
do not know a lot about that. C++ is easy to learn, but hard to master.)
Good design and code structure eliminate most pointer problems; proper
testing will get the rest. Smart pointers are nice, but in real-time
processing they are an additional overhead (and an unknown one at that,
since you don't know the underlying libraries).
Depends on the smart pointer. shared_ptr indeed has a runtime cost,
since it maintains additional data, but unique_ptr does not, afaik: it
is made from pure templates, so there is only a compile-time cost.
You need to check your templates. Templates generate code. Code needs
resources to execute. Otherwise there would be no difference between a
unique_ptr and a C pointer.
In practice, you can replace every occurrence of std::unique_ptr<int>
with int* in your code. It will still work, and have no bugs. Except, of
course, that you will have to remove some ".get()", ".release()" and
things like that here and there.
You can not do the inverse transformation, because you can not copy a
unique_ptr.
The only use of unique_ptr is to forbid some operations. The code it
generates is the same as what you would have written around your raw
pointers: new, delete, swap, etc.
Of course, you can say that the simple fact of calling a method has an
overhead, but most of unique_ptr's stuff is inlined, and that is without
even speaking about the compiler's optimizations.
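
The "no additional data" part of this argument can be checked directly. The following is only a sketch: the function names are made up for illustration, and the size equality is an observation about mainstream standard libraries with the default deleter, not something the C++ standard promises.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: true when a unique_ptr with the default deleter
// occupies exactly as much memory as a raw pointer. This holds on the
// common implementations, but the standard does not guarantee it.
constexpr bool same_size_as_raw_pointer() {
    return sizeof(std::unique_ptr<int>) == sizeof(int*);
}

// Hypothetical helper: hands ownership from a unique_ptr to a raw pointer
// and back, which is the kind of ".release()" call mentioned above.
int round_trip(int value) {
    std::unique_ptr<int> owner(new int(value));
    int* raw = owner.release();        // owner is now empty; raw owns the int
    assert(owner.get() == nullptr);
    std::unique_ptr<int> again(raw);   // ownership goes back to a unique_ptr
    return *again;                     // the int is deleted when `again` dies
}
```

Whether the generated instructions also match a hand-written new/delete pair depends on the optimization level, which is what the rest of the exchange is about.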
Even inlined code requires resources to execute. It is NOT as fast as
regular C pointers.
I did some testing, to be sure. With -O3, the code is exactly the same.
I did not try with -O1 and -O2. Without optimization, the 5 lines with
pointers were half the size of those using unique_ptr. But I never ship
unoptimized software (the level depends on my needs, though, and usually
I do not use -O3).
First of all, with the -O1 and -O2 optimization you got extra code.
Did you try it? I just did, with code doing simply a new and a delete,
with a raw pointer against a unique_ptr. In short, the simplest usage
possible.
In the file names, the number is the optimization level, p means pointer
and u means unique_ptr.
It seems that it is the 2nd level of optimization which removes the
difference.
Which is why your -O3 was able to optimize the extra code out. A
more complicated test would not do that.
I'll try to make a more complicated test today, which uses all common
stuff between raw and unique pointers.
7244 oct. 23 01:57 p0.out
6845 oct. 23 01:58 p1.out
6845 oct. 23 01:58 p2.out
6845 oct. 23 01:58 p3.out
11690 oct. 23 01:59 u0.out
10343 oct. 23 01:59 u1.out
6845 oct. 23 01:59 u2.out
6845 oct. 23 01:59 u3.out
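
The test source itself was not posted in the thread. A minimal sketch of what such a pair might look like is given here; the Counted type and the function names are assumptions added so that the two versions can be checked for identical behaviour, and they are not the poster's code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type that counts live instances.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// "p" variant: a new and a delete through a raw pointer.
void raw_version() {
    Counted* p = new Counted;
    delete p;
}

// "u" variant: the same allocation owned by a unique_ptr.
void unique_version() {
    std::unique_ptr<Counted> p(new Counted);
}   // the delete happens here, in the unique_ptr destructor

// Both variants must leave no object behind.
void check_no_leak() {
    raw_version();
    unique_version();
    assert(Counted::alive == 0);
}
```

Building each variant into its own program with, say, g++ -std=c++11 at -O0 through -O3 and comparing the binaries would reproduce the kind of measurement listed above; the exact sizes depend on the compiler and library version.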
That means the template DOES create more code. With -O3, your
*specific test* allowed the compiler to optimize out the extra
code.
But that's only in your test; other code will not fare as well.
Indeed it adds code. But what is relevant is what you will release, and
if by adding some switches you get the same final result, then it is not
a problem, at least for me. (I am interested in that stuff now, but too
tired for now. Tomorrow I'll test with various switches to find out
exactly which one gives those results, with better test code. Sounds
like a good occasion to learn a few things.)
Only in your simple case.
Now, I have never found any benchmark comparing raw pointers and
unique_ptr; it could be interesting to have real numbers instead of
assumptions. I'll probably do that tomorrow.
I don't care about benchmarks in such things. If I need a
unique_ptr, I use a unique_ptr. If I don't, I (may) use a raw
pointer.
I do not really like benchmarks, to be honest, but if I can learn
something by writing some testing code, then why not.
If there is a performance problem later, I will find the problem and
fix it. But I don't prematurely optimize.
Agree.
Additionally, when the unique_ptr object is destroyed, the object being
pointed to must also be destroyed.
If you do not provide an empty deleter, you are right. This is the
default behavior. But you can provide one, for example if you need to
interface with C libraries, like SDL.
The object is destroyed, whether you provide a destructor or not. If
you do not provide one, the compiler provides an empty one.
I said, if you provide an empty deleter, not destructor. And that would
anyway be stupid, since if you need an empty deleter, you do not need
unique_ptr. But this mechanism is especially useful when you want to
simplify your life with C libraries or old C++ libraries.
For example, to manage SDL_Surface objects, you can simply declare the
pointer with something like this: "std::unique_ptr<SDL_Surface,
decltype(&SDL_FreeSurface)> surface_ptr(nullptr, SDL_FreeSurface);".
Or you can provide an empty deleter.
Or you can use the release method before destruction.
But anyway, unique_ptr is made to automate RAII, so deleting
automatically is a good thing. I can not see why you would want to use a
unique_ptr and not allow it to delete things... That would be the same
as using a vector and never using its capacity to be a dynamic
container. Instead, use raw pointers or references.
You don't. That's one of the purposes of a unique_ptr.
Indeed. That's why I did not really understand your point here.
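
The custom-deleter idea discussed above can be sketched without depending on SDL. Surface, create_surface and free_surface below are hypothetical stand-ins for a C library's create/free pair (such as the one around SDL_FreeSurface); they are not real SDL functions.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API, standing in for a library like SDL.
struct Surface { int width; int height; };
static int live_surfaces = 0;

Surface* create_surface(int w, int h) {
    ++live_surfaces;
    return new Surface{w, h};
}

void free_surface(Surface* s) {
    --live_surfaces;
    delete s;
}

// The second template argument is the deleter's *type*; the function
// itself is handed to the constructor.
using SurfacePtr = std::unique_ptr<Surface, decltype(&free_surface)>;

int use_surface() {
    SurfacePtr surface(create_surface(640, 480), free_surface);
    assert(live_surfaces == 1);
    return surface->width * surface->height;
}   // free_surface is called here instead of delete
```

One design point relevant to the overhead debate: a function-pointer deleter is stored inside the unique_ptr, so this variant is typically larger than a raw pointer, whereas a stateless functor type used as the deleter usually is not.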
Neither of these can be handled by the compiler. There must be
run-time code associated with the unique_ptr to ensure the above.
As I said, the destructor compiled with -O3 needs no extra runtime code
(same for -O2).
Same for allocations and constructors.
Now, if you have any code where what I said is wrong, I would be happy
to take a look at it.
You never *need* a constructor or a destructor, unless you need to do
things yourself. But if you do not provide them, the compiler will
provide dummy ones for you. This satisfies the C++ requirement that
all objects have both a constructor and a destructor.
I know that perfectly. Well, to be more exact, you *need* to write
default/copy/move constructors once you have added a non-default one.
But that is not related to the fact that default constructors and
destructors have no overhead compared to initializing the same variables
yourself, except that the compiler has to choose between size and speed
by inlining them or not.
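
The rule about which constructors the compiler still provides can be checked with type traits. In this sketch (WithCustomCtor and copy_value are made-up names), declaring a non-default constructor removes only the implicit default constructor; the copy and move constructors are still generated.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with one user-declared, non-default constructor.
struct WithCustomCtor {
    explicit WithCustomCtor(int v) : value(v) {}
    int value;
};

// The implicit default constructor is suppressed...
static_assert(!std::is_default_constructible<WithCustomCtor>::value,
              "no implicit default constructor");
// ...but copy and move construction are still provided by the compiler.
static_assert(std::is_copy_constructible<WithCustomCtor>::value,
              "implicit copy constructor exists");
static_assert(std::is_move_constructible<WithCustomCtor>::value,
              "implicit move constructor exists");

int copy_value(int v) {
    WithCustomCtor a(v);
    WithCustomCtor b(a);   // uses the implicitly generated copy constructor
    assert(b.value == a.value);
    return b.value;
}
```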
You should look at the unique_ptr template code. It's not easy to
read or understand (none of STL is). But you can see the code in
it.
I do it quite regularly, and honestly, for many parts it is not as
complex as people usually say.
unique_ptr is a good example there: it would be really easy to read if
programmers had not messed up the code by mixing tabs and spaces for
indentation. And if you use the classic setup of 8 spaces per tab, you
won't have my problem (I usually use 2 spaces in terminals, and 4 in
graphical applications).
For other parts, it may be harder, but still, if you understand how
templates work, it is not so hard once you are accustomed to their
naming conventions. Plus, that code is not too badly documented, which
helps.
Indeed, it is not easy to read for people not used to C++ syntax, but I
allow myself to think that I am no longer a beginner with this language.
STL is a very poor example on how to code. Bad naming conventions,
even worse documentation...
Sure, you can spend a lot of time trying to decode it, but good code
doesn't need that much work. Even after about 25 years of C++ I find
myself spending too much time trying to read it when I have to.
Indeed, it is far from the best C++ code I have seen. But it's not the
STL itself, it's GCC's STL. I have not used many compilers, and so not
many STL implementations, but some others could be better.
But none of this is related to the need to understand the basics of what
you use when writing a program. Not understanding, in broad outline, how
a resource you acquired works implies that you will not be able to
manage it correctly by yourself. This is valid for RAM, but also for
CPU, network sockets, etc.
Do you know how the SQL database you're using works?
No, but I do understand why comparing text is slower than comparing
integers on x86 computers. Because I know that an int can be stored in
one word, which can be compared with only one instruction, while the
text will require comparing more than one word, which is indeed slower.
And it can become even worse when the text is not ASCII.
So I can use that understanding to know why I often avoid using text as
keys. But it happens that sometimes the more problematic cost is not the
speed but the memory, and so sometimes I'll use text as keys anyway.
Knowing the word size of the SQL server is not needed to make things
work, but it helps to make them work faster, instead of requiring more
hardware to be bought.
First of all, there is no difference between comparing ASCII text and
non-ASCII text, if case-sensitivity is observed.
Character size, in bits. ASCII uses 7 bits, E-ASCII uses 8, UTF8 = 8,
UTF16 = 16, etc. It has an impact on memory, bandwidth and the
instruction sets used.
But ASCII, even if it only uses 7 bits, is stored in an 8 bit byte. A 4
byte ASCII character field will take up exactly the same amount of room
as a 32 bit integer. And comparison can use exactly the same machine
language instructions for both.
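
The point about a 4-byte character field and a 32-bit integer can be illustrated as follows. This is a sketch with made-up function names, and it only covers equality tests.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(char[4]) == sizeof(std::uint32_t),
              "a 4-char field and a 32-bit integer have the same size");

// Hypothetical helper: equality of two 4-byte fields as one 32-bit compare.
bool equal_as_word(const char (&a)[4], const char (&b)[4]) {
    std::uint32_t wa, wb;
    std::memcpy(&wa, a, sizeof wa);   // memcpy avoids alignment/aliasing issues
    std::memcpy(&wb, b, sizeof wb);
    return wa == wb;
}

// Hypothetical helper: the same test done byte by byte.
bool equal_as_bytes(const char (&a)[4], const char (&b)[4]) {
    return std::memcmp(a, b, 4) == 0;
}

// The two approaches must always agree on equality.
void check_agreement(const char (&a)[4], const char (&b)[4]) {
    assert(equal_as_word(a, b) == equal_as_bytes(a, b));
}
```

For ordering rather than equality the two views can differ: on a little-endian machine the integer comparison weighs the last byte most, while a byte-wise comparison weighs the first byte most.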
Be fair. Compare the max number you can represent with a uint32_t with
the max you can represent with a char*.
For int8_t and char:
256 values versus 10 if we limit ourselves to digits. If we limit
ourselves to printable ASCII characters, ok, we can have... a little
fewer than 100, but what a strange numeric base it will be...
The max number of values you can store in a char is 2^8. The max you can
represent with a 4 byte character field is exactly the same as that of
an int. Both have 32 bits, so it's 2^32. Nothing says a char* field has
to contain only the digits 0-9 (or A-Z for that matter).
I know that pretty well.
If you use chars as very short integers, you are right. I did that
sometimes, too, when I did not know about uint8_t and int8_t.
But when you use them for real text, you will only rarely use values
like, say, 0x01 to 0x10. I am not being very precise, and that range
could contain \t. I do not remember its ASCII value... (and IIRC, 0x10
is \r or \n, not sure which one. But you probably know what I mean
here.)
Technically, everything is only bits, 0 and 1. But when I see a char[4]
I expect it to contain printable characters, not the equivalent of an
int.
That's one difference. I don't expect it to necessarily contain
printable characters. I expect it only to contain 4 characters worth
of information. Documentation will say whether it is printable
characters or not.
The exact same set of machine language instructions is generated.
However, if you are doing a case-insensitive comparison, ASCII is
definitely slower.
And saying "comparing text is slower than integers" is completely wrong.
For instance, a CHAR(4) field can be compared just as quickly as an INT
field, and CHAR(2) may in fact be faster, depending on many factors.
It is partially wrong. Comparing a text of 6 characters will be slower
than comparing a short.
6 characters: "-12345", and you have the same data in only 2 bytes.
I didn't say 6 characters. I SPECIFICALLY said 4 characters - one case
where your "strings take longer to compare than integers" is wrong.
With your char[4], you can only go up to 9999; even a short can
represent more, and so compute (including comparison) faster.
Now, you say I'm completely wrong, when you take 1 case where I am
wrong. I agreed with you that I was partially wrong, but when saying
that comparing text is slower than comparing integers, I did not
specifically mean values of the same binary size. Plus, in the SQL
context, I thought it was obvious that I was referring to the fact that
sometimes integers are good replacements for primary keys.
And what says you are limited to digits? And since both are 4 byte
fields, they can hold exactly the same number of bits, and therefore the
same number of values. Now often a char[4] will only contain
alphanumeric values, which limits the number of values it does contain.
But either way, the comparison is exactly the same.
I never said integers weren't good replacements as primary keys; I just
disagreed with your statement that char fields always take longer for
comparison. They do not.
But, fine. You said that, leaving uppercase/lowercase aside, it's the
same?
So, how would you sort texts which include, for example: éèê ?
Those chars are greater than 'f' if you take only their numerical
values. But then, results would be... just wrong. They should, at least
in French, be considered equal to 'e'. That's what users would expect.
So the comparison of texts will need a replacement and then a
comparison, for the simpler solution (and the worse one, since it does
not sort the accented characters among themselves, plus it loses
information).
And we have plenty of characters like that here, which we use in lots of
words (and names, indeed).
How can you do it in an integer field?
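
The "replacement and then comparison" approach described above can be sketched as follows. Latin-1 (ISO 8859-1) encoding is assumed, where é, è and ê are the bytes 0xE9, 0xE8 and 0xEA; fold_accents is a hypothetical helper that handles only those three letters and, as the post says, loses information.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replace Latin-1 é, è, ê with plain 'e'.
std::string fold_accents(std::string s) {
    for (char& c : s) {
        unsigned char u = static_cast<unsigned char>(c);
        if (u == 0xE8 || u == 0xE9 || u == 0xEA) c = 'e';
    }
    return s;
}

// Plain byte-wise ordering, as an integer-style comparison would give.
bool bytewise_less(const std::string& a, const std::string& b) {
    return a < b;
}

// "école" in Latin-1; the literal is split so that the hex escape
// does not swallow the following 'c'.
std::string ecole_latin1() {
    std::string s = "\xE9" "cole";
    assert(s.size() == 5);
    return s;
}
```

With these definitions, the raw bytes put "école" after "fable" because 0xE9 is greater than 'f', while the folded form "ecole" sorts before it. A real application would more likely rely on a locale-aware collation than on a hand-written table.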
But now you're changing the rules (again), bringing into play
localization. I said nothing about that in my previous statements. A
plain char[4] field does not include localization.
Indeed, if you use characters as if they were numbers... correct syntax,
yes. But I wonder about the semantics.
Absolutely nothing wrong with it. Works fine.
Texts are directly linked with localization, so you can not simply
compare them as if they were integers. It only works with English (or
maybe there is another language which uses only 26 letters without
accents, but I do not know of one). The old trick of treating chars as
bytes should no longer be taught, in my opinion. Students could use it
in real situations where there are accented characters, and cause bugs
for nothing.
Text is not necessarily linked to localization. Maybe in France it is,
but not here in the U.S.
As I said, in English it is obviously not needed, since you simply have
to use ASCII, and things are done automatically. But localization is
something that every other country has to work with, if it has accented
characters. Of course, I only speak about Latin languages; for other
languages I have no idea.
That's where I admit virtually all of my experience (even when I was in
Hong Kong and Germany) was with latin_1 (American) with no accented
characters. But then I've always worked with American firms, even
overseas.
Plus, CHAR(4) is not necessarily coded on 4 bytes. Characters and bytes
are different notions.
In any current database, CHAR(4) for ASCII data is encoded in 4
bytes. Please show where that is not the case.
Ok, you got me on the char(4) stuff. All that time I was thinking about
text, but forgot to be explicit enough. I thought I mentioned unicode
somewhere... but I'm too lazy to check.
I was talking specifically about char(4). I did not mention unicode or
other character sets.
And at the start, I was speaking about texts.
As a side note, I'm surprised. CHAR does not seem to be in the standard:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCwQFjAA&url=http%3A%2F%2Fjtc1sc32.org%2Fdoc%2FN1951-2000%2F32N1964T-text_for_ballot-FCD_9075-2.pdf&ei=6JBlUoPXNoa00QWCrIGYAw&usg=AFQjCNHiJl_XShEUGPmObfmrji81RtDVNg&sig2=1vIOHIp64_oLVO8rIuMjIA&bvm=bv.54934254,d.d2k
It is. See page 151 - CHAR is a reserved word. Page 177 - CHAR can be
used in place of CHARACTER.
Oh, I did not see it. Thanks.
But if an extra 4 byte key is going to cause you memory problems, your
hardware is already undersized.
Or your program could be too hungry, because you did not know that you
have limited hardware.
As I said - your hardware is already undersized. If adding 4 bytes to a
row is going to cause problems now, you'll have even greater problems
later.
Smaller stuff is often better. It is not a problem of hardware being
undersized.
Smaller is better within limits. Clarity and portability are more
important, IMHO.
I said often, right?
Yes, and I just said I don't worry about size unless it is a problem.
And here, we are no longer talking about simple efficiency, but about
something which can make an application completely unusable, with
"random" errors.
Not at all. Again, it's a matter of understanding the language you are
using. Different languages have different limitations.
So it must be that C's limitations are not fixed enough, because type
sizes can vary according to the hardware (and/or compiler).
Sure. And you need to understand those limitations.
Indeed. And those are dependent on hardware and compiler, for the C and
C++ languages at least.
No, they are completely dependent on the compiler being used.
And BTW - even back in the days of 16 bit PCs, C compilers still used 32
bit ints.
I can remember having used Borland Turbo C (can't remember the version,
I do not even remember if it was able to support C++), and its "int"
type was a short. 16 bits. It was when I switched to another (more
recent) compiler that I stopped using int, for this exact reason. And
when I discovered stdint.h, I simply loved it immediately. No more
surprises, no implicitly defined sizes.
Yes, and long was 32 bits. Check the C and C++ specs. The only thing
they say is that short <= int <= long. There is no reason an int could
not have been 32 bits; it's just the way Borland defined it in their
compiler.
I never said that their behavior was standard or not standard. I simply
explained why I now try to avoid using short, int and long, which have
different behaviors depending on the tools, and why I think C99 made a
nice improvement by bringing in stdint.h (which MS does not include... I
won't comment on that here).
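
The difference between the built-in types and the stdint.h ones can be sketched like this, using the C++ header <cstdint>. add_u16 is a made-up helper; the static assertions only restate what the fixed-width types promise where they exist.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// The exact-width types have the same size with every compiler that
// provides them.
static_assert(sizeof(std::uint8_t) * CHAR_BIT == 8, "uint8_t is 8 bits");
static_assert(sizeof(std::int16_t) * CHAR_BIT == 16, "int16_t is 16 bits");
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");

// For the built-in types only the ordering is fixed; the actual sizes are
// chosen by the compiler.
static_assert(sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long),
              "short <= int <= long");

// Hypothetical helper: the sum wraps at 2^16 whatever the width of int.
std::uint16_t add_u16(std::uint16_t a, std::uint16_t b) {
    std::uint16_t sum = static_cast<std::uint16_t>(a + b);
    assert(sum == ((a + 0UL + b) & 0xFFFFUL));
    return sum;
}
```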
But even with that standard, I have never seen any learning document (or
any lesson) encouraging their use. As with some other very important
stuff in programming, I had to discover that myself. Not a problem, of
course; when I chose programming as a job, I knew that, but still...
Once again, it's all about knowing your tools. We teach their use,
and tell people what the standard is, and how to pick the appropriate
type.
The teachers I had did not work that way. To be honest, I do not
remember having any fun with programming during their lessons. I was
almost disgusted with programming, because we spent more time drawing
applications with those dirty RAD tools than really programming. And the
knowledge I could have acquired about which method changes the color of
the text, or things like that, is now useless, since MFC and Qt3 are no
longer used. But that's another story.
But even then it can be a problem. For instance, there are
differences between short, int and long in 32 and 64 bit versions of
many C and C++ compilers.
So, ok, if you can find a job where every single low level feature you
require is available through high level functions/objects, knowing what
you are sitting on is useless. Maybe I am wrong because I actually am
interested in knowing what is behind the screen, and not only in the
screen itself. But still, if you only know about your own stuff, and the
man who will deploy it only knows about his own stuff, won't you need a
3rd person to allow you to communicate? Which implies a loss of time.
No, it's called a DESIGN - which, amongst other things, defines the
communications method between the two. I've been on projects with up to
100 or so programmers working on various pieces. But everything worked
because it was properly designed from the outset and every programmer
knew how his/her part fit into the whole program.
I do not think that most programmers work in teams of hundreds of
people. But I may be wrong. I do not know.
I didn't say most did. I DID say they exist, for large projects.
I never said you said all did. I simply said I do not know the average
size of programmers' teams around the world.
I think, but it's pure guessing based on my small experience, that IT
departments with more than 30 people in R&D are not the most common
situation, and if I am right, then programmers... no, people, have to be
able to interact with other teams which are doing different things. And
to interact, you need to be able to speak the same language.
At least, that's what I have seen in my experience. But, indeed, my CV
is far less impressive than yours, and I would never use it to prove
that I am right.
(note: we were exactly 6 programmers. There were 3 sysadmins, 1 project
lead, and 3 others were for support.)
In most companies, programmers are not part of R&D. They are there to
support the business. For instance, in an insurance company, they write
programs to support customers, transactions, various policy types
offered by the company, billing, accounting, payroll - on and on.
Engineering firms (like the one it sounds like you work for) have a much
higher percentage in R&D, often because they are smaller and use a lot
more packaged solutions to run their businesses, whereas there are no
packaged solutions to support their engineering needs (i.e. ARM
controllers, etc.).
In my last job, when we had something to release, we usually talked
directly with the people who then had to deploy it, to explain to them
some requirements and consequences that were not directly part of our
programmer's job. Indeed, I was not employed by Microsoft, Google or
IBM, but very far from that: we were fewer than 10 devs.
But now, are most programmers paid by companies with hundreds of
programmers?
In the jobs I've had, the programmers have never had to talk to the
deployers. Both were given the design details; the programmers wrote to
the design and the deployers generated the necessary scripts to deploy
what the design indicated. When the programmers were done, the deployers
were ready to install.
Maybe you worked only in big structures, or maybe this one was doing
things wrong. But the IT team was quite small, if we only consider
sysadmins, devs, project leads and a few other roles. Fewer than 15
people.
Even 15 person teams can do it right. Unfortunately, too many companies
(both big and small) won't do it right. A major reason why there are so
many bugs out there. Also a major reason why projects go over time and
over budget.
Sorry, but I do not see what's wrong with having communication between
the various actors of a project.
I have followed various classes on good ways to manage projects, stuff
about ITIL for example, and I can not remember having learned that this
part of the process was wrong in my last job. Other parts were, and not
only a little, but I do not think I learned that something was wrong in
that part. But those were only lessons, and I never read the whole ITIL
material.
There's no problem with having communications with small groups of
programmers. But when you have > 100 programmers working on a project,
you can't have all of them communicating with each other - nothing would
get done. So you break them up into small teams and assign each group a
piece of the program. Those people talk together, and the team leaders
talk together. It is much more manageable.
But "doing it right" has very little to do with team size. Rather,
it has to do with management.
<snip>
To be fair, it wasn't all the manager's fault. He was under a lot of
pressure from above to do better. But never having been a programmer
himself, he had never seen how a well-oiled operation worked. But he
never listened to the couple of guys on his team who tried to convince
him of how it should run before I was brought in.
I did not think for a second that it was only his fault. But it is a
common problem, from what I can see even at my young age: some people
think they know what others should do even if they do not know what they
need.
Computer science is a young science; it will take time for people to
admit that software is not just wind. Giving people enough time to do
their job does not seem to be the rule these days. And not only in
computing.
Computer Science is not young any longer. It was young when I
started around 45 years ago; it was maturing 30 years ago. But I
would never call it "young" any more.
I could go into a lot about what's going on, but we've strayed way
off topic here. I think this is a good time to stop and I won't
respond any longer.
Jerry
Agree.