Reverse and Alphabetize in under 2kb

C

Chad

Here is the job interview question that I drew a blank on.

Make a C program that:

Accepts a string input
Displays the original string
Reverses the order of words in the string ("This is a string" =
"string a is This")
Alphabetizes the letters in each word individually ("hisT is a
ginrst")

It all has to be under 2kb and I can't use strtok().
 
E

Eric Sosman

Here is the job interview question that I drew a blank on.

Make a C program that:

Accepts a string input
Displays the original string
Reverses the order of words in the string ("This is a string" =
"string a is This")
Alphabetizes the letters in each word individually ("hisT is a
ginrst")

It all has to be under 2kb and I can't use strtok().

Seems simple enough except for the "under 2kb" part.
What's that supposed to mean? Less than 2KB of source code?
All other "2KB" measures I can think of (code size, data
size, stack size, heap size, ...) would be entirely platform-
dependent and not directly controllable by the way you use
the C language. Even HelloWorld probably takes more than
2KB of executable code on most implementations.
 
C

Chad

     Seems simple enough except for the "under 2kb" part.
What's that supposed to mean?  Less than 2KB of source code?
All other "2KB" measures I can think of (code size, data
size, stack size, heap size, ...) would be entirely platform-
dependent and not directly controllable by the way you use
the C language.  Even HelloWorld probably takes more than
2KB of executable code on most implementations.

--

Yeah, the 2kb part is what confused me. I was just assuming they meant
the size of the executable file.
 
B

bartc

Chad said:
Here is the job interview question that I drew a blank on.

Make a C program that:

Accepts a string input
Displays the original string
Reverses the order of words in the string ("This is a string" =
"string a is This")
Alphabetizes the letters in each word individually ("hisT is a
ginrst")

What do you mean by alphabetize? And what happened to reversal, or did you
want this separately from the alphabetizing?

I got:

"ginrst a is This"

by reversing the words and sorting the characters in each word (but all
upper case come before all lower case).
 
C

Chad

What do you mean by alphabetize? And what happened to reversal, or did you
want this separately from the alphabetizing?

I got:

"ginrst a is This"

by reversing the words and sorting the characters in each word (but all
upper case come before all lower case).

I figured they wanted reversal and alphabetizing done separately.
 
R

rudolf

Chad said:
Yeah, the 2kb part is what confused me. I was just assuming they meant
the size of the executable file.

You know that you can ask the interviewer questions, right? If you
don't understand what they are asking of you, ask them them to clarify
the question.
 
M

Michael Foukarakis

Here is the job interview question that I drew a blank on.

Make a C program that:

Accepts a string input
Displays the original string
Reverses the order of words in the string ("This is a string" =
"string a is This")

Is the output from this stage needed to be made visible?
Alphabetizes the letters in each word individually ("hisT is a
ginrst")

I assume, from your parenthesized example, this is independent
(conceptually) from stage 3?
 
W

Willem

Chad wrote:
) Here is the job interview question that I drew a blank on.
)
) Make a C program that:
)
) Accepts a string input
) Displays the original string
) Reverses the order of words in the string ("This is a string" =
) "string a is This")
) Alphabetizes the letters in each word individually ("hisT is a
) ginrst")
)
) It all has to be under 2kb and I can't use strtok().

- How is the string delivered ?
- Can you use strpbrk, strcspn, strchr or whatever ?
- What exactly is, and isn't, a word ?
- Is it only required that the results are printed, or do they also
have to be present as actual strings somewhere during execution ?
- Does 'alphabetize' mean case-insensitive ?
- Can you use the character values to sort on, other than case ?
- What exactly has to be 'under 2kb' ?

Assumed answers:
- On the commandline, as a single argument (so it's argv[1])
- Yes
- A word is anyting that is not a ' ' character
- They only need to be printed
- Yes, case-insensitive
- Yes you can (works both in ASCII and EBCDIC)
- The source code, after whitespace is removed where possible

Although a lot of those are pretty arbitrary.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
W

Willem

Richard Heathfield wrote:
) pete wrote:
)> Eric Sosman wrote:
)>> On 3/2/2010 6:13 PM, Chad wrote:
)>>
)>>> Here is the job interview question that I drew a blank on.
)>>>
)>>> Make a C program that:
)>
)>>> It all has to be under 2kb and I can't use strtok().
)>>
)>>
)>> Seems simple enough except for the "under 2kb" part.
)>> What's that supposed to mean?
)>
)> The assignment as stated, is to make a C program.
)>
)> A C program is composed of text files.
)
) The problem with making code size the only priority is that you end up
) with something like this:

<snip excellent example of compressed code>

Well, actually 2Kb is a pretty reasonable size for this exercise.
Even if you use sensible variable and function names, and straightforward
algorithms, you're unlikely to breach that requirement. My first try
(for both exercises) reached about 600 bytes, which is easily compressed
down to under 500:

#include <stdio.h>
#include <string.h>
#include <ctype.h>
void r(char *s)
{
if(!*s)return;
size_t l=strcspn(s," \t\n");
if(s[l]){
r(s+l+1);
putchar(s[l]);
}
fwrite(s,l,1,stdout);
}
void a(char *s)
{
while(*s){
size_t i,l=strcspn(s," \t\n");
int c;
for(c='a';c<='z';c++)
for(i=0;i<l;i++)
if(tolower(s)==c)
putchar(s);
s+=l;
if(*s)putchar(*s++);
}
}
int main(int c, char *v[])
{
r(v[1]);putchar('\n');
a(v[1]);putchar('\n');
return 0;
}


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
B

Ben Bacarisse

Richard Heathfield said:
The trick to this question is to realise that the third stage
(reversing the word order) is greatly simplified by the fourth stage
(alphabetising the letters in each word). In stage 3, it is sufficient
to reverse the entire string; you don't need to detect word boundaries
at that point.

That does not match the example though. Applying stage 4 to the
result of stage 3 would yield "ginrst a is hisT". Of course, the
trouble is that the only specified output is the original string, so
your guess about what is expected is a good as mine.
Stage 4 then becomes a simple parse-and-qsort loop.

I assumed the trick was to abstract the idea of "doing something to a
word" so that the strings specified in both stage 3 and, separately,
stage 4 could be produced with minimal code.
 
K

Keith Thompson

Willem said:
Chad wrote:
) Here is the job interview question that I drew a blank on.
)
) Make a C program that:
)
) Accepts a string input
) Displays the original string
) Reverses the order of words in the string ("This is a string" =
) "string a is This")
) Alphabetizes the letters in each word individually ("hisT is a
) ginrst")
)
) It all has to be under 2kb and I can't use strtok().

- How is the string delivered ?
- Can you use strpbrk, strcspn, strchr or whatever ?
- What exactly is, and isn't, a word ?
- Is it only required that the results are printed, or do they also
have to be present as actual strings somewhere during execution ?
- Does 'alphabetize' mean case-insensitive ?
- Can you use the character values to sort on, other than case ?
- What exactly has to be 'under 2kb' ?

Assumed answers:
- On the commandline, as a single argument (so it's argv[1])

The program "Accepts a string input". The word "input" tends to imply
reading from stdin. Also, argv doesn't give you a string; it gives
you a list of strings. You can arbitrarily construct a single string
by, say, joining the arguments with spaces, but that's not specified
in the problem statement.

If argv[1] is "one two" and argv[2] is "three four", is that two words
or four?
- Yes
- A word is anyting that is not a ' ' character

I'm sure you didn't mean to say that "abc" is three words. One
reasonable definition is that a word is a maximal non-empty substring
consisting only of non-whitespace characters, but that needs to be
clarified.
- They only need to be printed
- Yes, case-insensitive

Hmm, I'm not sure I would have assumed case-insensitivity, but the
word "alphabetizes" does tend to imply it. But this should
be clarified.
- Yes you can (works both in ASCII and EBCDIC)
- The source code, after whitespace is removed where possible

The stated requirements don't imply to me that whitespace may be
removed; I would assume it's the full size of the source file.
(Which argues for using a Unix-like system rather than Windows,
so each end-of-line is only 1 byte rather than 2.)
Although a lot of those are pretty arbitrary.

Yup.

Turning unclear requirements into clear requirements is an important
skill for a programmer. A job interview is a good opportunity to
demonstrate this skill. (In one interview, I spent so much time
discussing the requirements for a problem that I never actually
had to solve the problem itself. Yes, I got the job.)
 
W

Willem

Keith Thompson wrote:
)> - Yes
)> - A word is anyting that is not a ' ' character
)
) I'm sure you didn't mean to say that "abc" is three words. One
) reasonable definition is that a word is a maximal non-empty substring
) consisting only of non-whitespace characters, but that needs to be
) clarified.

That's roughly what I meant, yes.

)> - They only need to be printed
)> - Yes, case-insensitive
)
) Hmm, I'm not sure I would have assumed case-insensitivity, but the
) word "alphabetizes" does tend to imply it. But this should
) be clarified.

The example given by the OP also implies case-insensitivity.

)> - Yes you can (works both in ASCII and EBCDIC)
)> - The source code, after whitespace is removed where possible
)
) The stated requirements don't imply to me that whitespace may be
) removed; I would assume it's the full size of the source file.
) (Which argues for using a Unix-like system rather than Windows,
) so each end-of-line is only 1 byte rather than 2.)

Then you could make a 'better' program by removing the whitespace yourself.
Note I said 'where possible', meaning to imply that the result should still
compile and work.

As it turns out, it can be done quite easily in 1Kb, though.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
P

Peter Nilsson

Willem said:
As it turns out, it can be done quite easily in 1Kb, though.

Indeed.

% type obfuscate.c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define q1(q2,q3) (((q3)<(q2))-((q2)<(q3)))
int q4(const void*q5,const void*q6){const char*q7="abcdefgh"
"ijklmnopqrstuvwxyz";const unsigned char*q8=q5;const unsigned
char*r9=q6;unsigned char q10=tolower(*q8);unsigned char q11=
tolower(*r9);const char*q12=strchr(q7,q10);const char*q13=
strchr(q7,q11);if(q12&&q13)return q1(q12-q7,q13-q7);if(q12)
return-1;if(q13)return+1;return q1(q10,q11);}void q14(char*
q15,char*q16){if(!q16)q16=q15+strlen(q15);if(q15!=q16)for(;
q15<--q16;q15++){char q17;q17=*q15;*q15=*q16;*q16=q17;}}int
main(void){static char q18[1024];char*q15,*q16;if(fgets(q18,
sizeof q18,stdin)){char*q19=strchr(q18,'\n');if(q19)*q19=0;
puts(q18);q14(q18,0);for(q15=q18;*q15;q15=q16){q15=q15+strspn
(q15," \t");q16=q15+strcspn(q15," \t");q14(q15,q16);}puts(
q18);for(q15=q18;*q15;q15=q16){q15=q15+strspn(q15," \t");q16
=q15+strcspn(q15," \t");qsort(q15,q16-q15,1,q4);}puts(q18);}
return 0;}

% acc obfuscate.c -o homework.exe

% type input.txt
This is a string

% homework.exe < input.txt
This is a string
string a is This
ginrst a is hisT

% dir obfuscate.c
Volume in drive E is USB DISK
Volume Serial Number is 14CC-F5A9

Directory of E:\C

04/03/2010 12:30 PM 1,000 obfuscate.c
1 File(s) 1,000 bytes
0 Dir(s) 906,080,256 bytes free

%
 
W

Willem

Peter Nilsson wrote:
)<snip>
)> As it turns out, it can be done quite easily in 1Kb, though.
)
) Indeed.
)
) % type obfuscate.c

<snip 1000-byte very obfuscated code>

With reasonable variable and function names, I get 666 bytes:
(Note that the original has tabs for indent, which I substituted
for spaces for usenet purposes.)

#include <stdio.h>
#include <string.h>
#include <ctype.h>

void reverse(char *str)
{
if(!*str) return;
size_t len = strcspn(str," \t\n");
if(str[len]) {
reverse(str+len+1);
putchar(str[len]);
}
fwrite(str, len, 1, stdout);
}

void sortwords(char *str)
{
while(*str) {
size_t i, len = strcspn(str," \t\n");
int c;
for(c = 'a'; c <= 'z'; c++) {
for(i = 0; i < len; i++) {
if(tolower(str) == c) {
putchar(str);
}
}
}
str += len;
if(*str) {
putchar(*str++);
}
}
}

int main()
{
char str[1024];
if (fgets(str, sizeof str, stdin)) {
reverse(str);
putchar('\n');
sortwords(str);
putchar('\n');
}
return 0;
}

Which just goes to show that algorithmic improvements are an
order of magnitude more important than peephole optimizations.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,043
Latest member
CannalabsCBDReview

Latest Threads

Top