Sunday, February 1, 2026

NoSQL Databases Still Have Risks






NoSQL, which stands for Not Only SQL, is a common term for nonrelational databases. Among popular NoSQL databases you will find the MongoDB, Cassandra, CouchDB, Redis and more. NoSQL databases have become increasingly popular thanks to their benefits in particular use cases, especially in big data and real-time Web usages where performance, scalability and flexibility are key.

Database security has been and will continue to be one of the more critical aspects of application security. Access to the database grants an attacker a dangerous amount of control over the most critical information. Although the number of SQL injection vulnerabilities has been declining since 2008 due to use of secure frameworks and improved awareness, it has remained a high-impact risk.

With the emergence of new databases and query techniques, the old attack methods become obsolete and new ones emerge. For example, most NoSQL databases do not use SQL and instead use the JavaScript Object Notation (JSON) query language and an HTTP API. This makes old techniques like SQL injection obsolete. However, NoSQL definitely does not imply zero risk. In fact, NoSQL databases are vulnerable to injection attacks, cross-site request forgery (CSRF) and other vulnerabilities.

In a paper titled “No SQL, No Injection? Examining NoSQL Security,” in Web 2.0 Security and Privacy conference it has been demonstrated that a number of techniques for injections in different runtimes using MongoDB are possible. Additionally, the paper discusses Web APIs and their risks, such as CSRF, and some deployment recommendations.

Knowing the risks is key for protecting against them. Having automated security testing is also significant for achieving consistent results. Web application scanners, for instance, can use rules for finding vulnerabilities in NoSQL databases to help you protect against the new exploitation techniques.

Tuesday, September 2, 2025

Installing MongoDB on Ubuntu

Run the following command to import the MongoDB public GPG Key:
girdhar@dcsa:~$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Create a /etc/apt/sources.list.d/mongodb.list file using the following command.
girdhar@dcsa:~$ echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
Now issue the following command to update the repository:
girdhar@dcsa:~$ sudo apt-get update
Now install the MongoDB by using following command:
girdhar@dcsa:~$ sudo apt-get install mongodb-10gen=2.2.3
In the above installation 2.2.3 is currently released mongodb version. Make sure to install latest version always. Now mongodb is installed successfully.

Start MongoDB
girdhar@dcsa:~$ sudo service mongodb start
Stop MongoDB
girdhar@dcsa:~$ sudo service mongodb stop
Restart MongoDB
girdhar@dcsa:~$ sudo service mongodb restart
To use mongodb run the following command
girdhar@dcsa:~$ mongo
This will connect you to running mongod instance.



If It will not run (giving some error in locks), try running following: -
Step 1: Remove lock file.
girdhar@dcsa:~$ sudo rm /var/lib/mongodb/mongod.lock

Step 2: Repair mongodb.
girdhar@dcsa:~$ sudo mongod --repair

Step 3: start mongodb.
girdhar@dcsa:~$ sudo start mongodb
or
girdhar@dcsa:~$ sudo service mongodb start

Step 4: Check status of mongodb.
girdhar@dcsa:~$ sudo status mongodb
or
girdhar@dcsa:~$ sudo service mongodb status

Step 5: Start mongo console.
girdhar@dcsa:~$ mongo

Wednesday, January 8, 2025

Doing a PhD in Computer Science by Pankaj Jalote, Professor, IIT


Article Source: www.cse.iitk.ac.in/acad/doingphd.html


PhD is the highest degree offered anywhere in the world (barring the highly uncommon DSc). Its focus, unlike regular degrees, is not learning existing knowledge but creating new knowledge. No wonder, PhD is desired by anyone wishing to `make a mark' - the brightest seek it as it allows challenges that no other degree does; the innovative desire it as it allows the possibility to innovate and create new knowledge and technologies; the ambitious seek it as it is the top of the academic ladder; and the persistent seek it as this is the one degree where persistence and self discipline are brought to full use.
Despite the romance of doing a PhD, there seems to be some `fear' frequently in the youth about pursuing a PhD - prospective students are sometimes scared because they think it is ``too difficult and they will not be able to do it'', or because ``it takes too much time'', or perhaps just because ``PhD means that you have to be a teacher and I don't want to be a teacher''. All of these reasons/fears are actually false. This note discusses some of these issues in an attempt to clarify the situation for prospective candidates.

What is Involved in a PhD?

First let us clearly understand what is involved in doing a PhD. PhD is, as everyone knows, about doing research. And research is about formulating problems or questions whose answers the research or practitioner community wants to know, and whose answers are not known. Doing research is to provide some answer to these questions. So, the key aspects of doing a PhD are (a) formulating a question or a problem that is of interest and that can be solved, and (b) providing a useful/interesting solution to the stated problem. The results obtained are presented in national/international conferences, and/or submitted to scientific journals.
The problems that are addressed by a PhD scholar can range from very difficult open problems to evolutionary technology issues. In Computer Science, the problem area can range from highly theoretical/mathematical to modeling and experimentation or building new technologies. For example, there are a lot of open problems regarding the complexity of solving some problems algorithmically (for example, checking whether a number is a prime - a problem for which an efficient solution was proposed only recently by Prof. Agarwal of CSE/IITK). These problems typically involve complex theoretical and mathematical development. Similarly, there are many problems that require understanding the behavior of various systems. Approaching such problems frequently uses modeling and experimentation. Then there are problems of the type where something innovative and useful is done using computers and software. Working on such problems typically involves building systems and prototypes. In other words, a scholar doing a PhD in Computer Science has a wide range of areas to choose from, depending on his inclination, ability, and interests.
Generally, PhD programs world over proceed as follows: do some course work, pass some qualifying exam, and then write a thesis that has to be defended (sometimes, at the early stages of the thesis a `proposal' may have to be submitted.) In a place like CSE/IIT Kanpur, generally, a student who joins the PhD program after completing his/her B.Tech/BE will spend about 1 year doing the courses, about 1 to 2 years for formulating the problem, which also requires an in-depth study of the chosen area, and about 2 years or more for developing the solutions and writing the thesis. Once the thesis is written, it is examined by some experts and a thesis defense is scheduled. This process takes about 6 months, but the candidate can start his post PhD job once the thesis is submitted. Hence, doing a PhD takes about as much time as doing a BE, or the amount of time a doctor spends doing his residency and MD.
Doing a PhD is indeed hard. However, the difficulty is not because extreme intelligence is necessary. Brilliance, of course, helps - brilliant people can attack hard problems and produce solid results and leave a permanent mark on the field. However, students with good academic background and some amount of creativity can also do a PhD, and do quite well. Completing a PhD primarily requires a drive, motivation, and hard work. Hard and motivated work determines not only the quality of the final work, but also the amount of time needed to complete the PhD. In general, a PhD can be completed in 4 to 5 years - 4 years for the motivated and 5 for the not-so-motivated.

What are the Options after PhD?

Clearly, one of the main job opportunities after PhD is a faculty position in some University, College, or an Institute. This option, world over, is one of the most preferred options for people with PhD. One of the reasons for this is that in today's world, the best quality people require freedom in their work and have a strong desire to ``make a mark''. Both of these are well supported by research activities a faculty undertakes as part of his job - he has the freedom to select the problems he works on, and through his research he creates new knowledge which is published under his authorship. This is a very strong motivating factor and very good people across the world sacrifice other benefits for academic freedom and possibilities to create and innovate.
In faculty positions it should be mentioned that for Computer Science, as there is a shortage of qualified faculty in most good institutions, there are always openings in places like IITs/NITs/RECs/Central Universities, etc. In addition, now some private universities as well as institutions set up by overseas universities coming up which also are looking for qualified faculty (and these places are paying much more than Govt. scales - up to 3 times more!)
However, teaching is by no means the only option for a PhD. India today is fast becoming a center for global R&D. Due to the quality of its manpower, and the lower cost, many organizations have started setting up R&D centers in India. And these centers are looking for PhDs (and not finding them, as there are not enough getting produced!) Some example of these are IBM India Research Center in New Delhi, GE Research in Bangalore, Mentor Graphics in Hyderabad, Cadence, etc.
Besides these research centers, most of the big Indian IT companies have started high end technology development and consulting, and R&D centers or cells, in which they need PhDs. Almost all the IT majors - TCS, Infosys, Wipro, etc. employ a large number of PhDs (upward of about 50 each) and are always looking for more people to enhance their R&D activities. This activity will increase in these companies as these companies become larger. A few years back, we had done an informal survey on salaries with these companies and we were told that most of them will offer at least twice as much starting salary for a PhD as for a BE. Starting salaries for many of these positions are of the order of Rs 50,000 PM or more.
Besides these opportunities in India, people with PhDs are global citizens and move around quite a bit. Immediately after PhD, there are opportunities for post-doctoral work in US and Europe. While working as a researcher/faculty, there are opportunities for visiting faculty appointments or visiting researchers in US, Europe, Singapore, etc., where a faculty member from India can spend a year or two in overseas universities or research labs. Many faculty members from India avail of this from time to time.

Thursday, May 2, 2024

Machine language

Problem 1. Hand-compile the following C fragments into Beta assembly language. You can assume that the necessary storage allocation for each variable or array has been done and that a UASM label has been defined that indicates the first storage location for that variable or array. All of the variables are stored in main memory (in the first 32k bytes of main memory so that they can be addressed by a 16-bit literal). You can also assume that all variables and arrays are C integers, i.e., 32-bit values.
  1. Explain what Beta assembly language instruction(s) are needed to load the value of a variable that has been allocated in the first 32k bytes of main memory (i.e., at an address less than 0x8000). How would your answer change if the variable was located at address outside this range (e.g., at address 0x12468).
    If the storage for the variable is located at an address less than 0x8000, the 16-bit constant field of the LD instruction can hold the complete address. Note that the 16-bit constant is sign-extended, so our address has to fit in 15 bits. So LD(R31,addr,R0) would load the contents Mem[addr] into R0 assuming addr < 0x8000. For addresses >= 0x8000 the 16-bit constant field isn't large enough to hold the address. In these cases one could use the LDR instruction to load a 32-bit address into a register and the use LD to fetch the data:
    vaddr:
       LONG(0x12468)
    ...
       LDR(vaddr,R0)    ; load address of variable into R0
       LD(R0,0,R0)      ; load Mem[address] into R0
    
  1. a = b + 3*c;

       LD(c,R1)
       SHLC(R1,1,R0)    ; 2*c
       ADD(R0,R1,R0)    ; + c
       LD(b,R1)
       ADD(R1,R0,R0)   
       ST(R0,a)
    
  1. if (a > b) c = 17;
       LD(a,R0)
       LD(b,R1)
       CMPLE(R0,R1,R0)
       BT(R0,_L2)
       CMOVE(17,R0)
       ST(R0,c)
    _L2:
    
  1. if (sxt_short) { b = (b << 16) >> 16; }
       LD(sxt_short,R0)
       BEQ(R0,_L3)
       LD(b,R1)
       SHLC(R1,16,R0)    ; shift so that bit 15 is now bit 31
       SRAC(R0,16,R0)    ; shift back, replicating sign bit
       ST(R0,b)
    
  1. cjt->salary += 3752;
    Assume that the salary component of the structure pointed to by cjt has a byte offset of 8 from the beginning of the structure.
       LD(cjt,R1)
       LD(R1,8,R0)
       ADDC(R0,3752,R0)
       ST(R0,8,R1)
    
  1. a[i] = a[i-1];
       LD(i,R0)
       SHLC(R0,2,R0)
       LD(R0,a-4,R1)
       ST(R1,a,R0)
    
  1. sum = 0;
    for (i = 0; i < 10; i = i+1) sum += i;
       ST(R31,sum)
       ST(R31,i)
    _L7:
       LD(sum,R0)
       LD(i,R1)
       ADD(R0,R1,R0)
       ST(R0,sum)
       ADDC(R1,1,R1)
       ST(R1,i)
       CMPLTC(R1,10,R0)
       BT(R0,_L7)
    


Problem 2. In block structured languages such as C or Java, the scope of a variable declared locally within a block extends only over that block, i.e., the value of the local variable cannot be accessed outside the block. Conceptually, storage is allocated for the variable when the block is entered and deallocated when the block is exited. In many cases, this means the compiler if free to use a register to hold the value of the local variable instead of a memory location.
Consider the following C fragment:
int sum = 0;
{ int i;
  for (i = 0; i < 10; i = i+1) sum += i;
}
  1. Hand-compile this loop into assembly language, using registers to hold the values of the local variables "i" and "sum".
       MOVE(R31,R2)      ; R2 holds sum
       ST(R2,sum)
       MOVE(R31,R1)      ; R1 holds i
    _L5:
       ADD(R2,R1,R2)
       ADDC(R1,1,R1)
       CMPLTC(R1,10,R0)
       BT(R0,_L5)
       ST(R2,sum)
    
  1. Define a memory access as any access to memory, i.e., instruction fetch, data read (LD), or data write (ST). Compare the number of total number of memory accesses generated by executing the optimized loop with the total number of memory access for the unoptimized loop (part G of the preceding problem).
    The unoptimized code has an 8 instruction loop that makes 4 data accesses; 10 loop iterations => 120 memory accesses. There are 4 additional memory accesses to initialize sum and i. Total = 124.The optimized code has a 4 instruction loop that makes 0 data accesses; 10 loop iterations => 40 memory accesses. There are 6 additional memory accesses to initializes sum and i, and to store sum at the end of the loop. Total = 46.
  1. Some optimizing compilers "unroll" small loops to amortize the overhead of each loop iteration over more instructions in the body of the loop. For example, one unrolling of the loop above would be equivalent to rewriting the program as
    int sum = 0;
    { int i;
      for (i = 0; i < 10; i = i+2) { sum += i; sum += i+1; }
    }
    
    Hand-compile this loop into Beta assembly language and compare the total number of memory accesses generated when it executes to the total number of memory accesses from part (1).
       MOVE(R31,R2)      ; R2 holds sum
       ST(R2,sum)
       MOVE(R31,R1)      ; R1 holds i
    _L5:
       ADD(R2,R1,R2)
       ADDC(R1,1,R0)
       ADD(R2,R0,R2)
       ADDC(R1,2,R1)
       CMPLTC(R1,10,R0)
       BT(R0,_L5)
       ST(R2,sum)
    
    This code has a 6 instruction loop that makes 0 data accesses; 5 loop iterations => 30 memory accesses. There are 6 additional memory accesses to initializes sum and i, and to store sum at the end of the loop. Total = 36.


Problem 3.
  1. Hand-assemble the following Beta assembly language program:
            I = 0x5678
            B = 0x1234
    
            LD(I,R0)
            SHLC(R0,2,R0)
            LD(R0,B,R1)
            MULC(R1,17,R1)
            ST(R1,B,R0)
            I = 0x5678
            B = 0x1234
    
            LD(R31,I,R0)        011000 00000 11111 0101 0110 0111 1000 = 0x601F5678
            SHLC(R0,2,R0)       111100 00000 00000 0000 0000 0000 0010 = 0xF0000002
            LD(R0,B,R1)         011000 00001 00000 0001 0010 0011 0100 = 0x60201234
            MULC(R1,17,R1)      110010 00001 00001 0000 0000 0001 0001 = 0xA8210011
            ST(R1,B,R0)         011001 00001 00000 0001 0010 0011 0100 = 0x64201234
    
  1. What C statement might have been compiled into the code fragment above?
    B[I] = B[I] * 17;


Problem 4. Hand-assemble the following Beta branch instructions into their binary representation:
  1. foo: BR(foo) [recall that BR(label) = BEQ(R31,label,R31)]
        BEQ    R31   R31   offset = -1
        011101 11111 11111 1111 1111 1111 1111
    
  1. BR(bar)
    bar:
        BEQ    R31   R31   offset = 0
        011101 11111 11111 0000 0000 0000 0000
    
  1. foo = 0x100
    . = 0x1000
    BF(R17,foo,R31)
        BEQ    R31   R17   offset = (0x100 - 0x1004)/4 = 0xFC3F
        011101 11111 10001 1111 1100 0011 1111
    
  1. Explain why PC-relative branch addressing is a good choice for computers like the Beta that can encode only a "small" constant in each instruction.
    Branches are used to implement conditional and looping constructs (e.g., ifwhilefor). So most branch targets are just a few instructions away. With PC-relative addressing, we can reach targets 32768 instructions before or 32767 instructions after the branch, independent of the actual absolute address of the branch. So used as an offset, the 16-bit constant can accommodate most branch targets even for very large programs. Used as an absolute address, branch targets would be constrained to be in the first 32K of memory.
  1. Suppose a different computer could encode an arbitrary 32-bit constant in an instruction (using, e.g., a variable-length instruction encoding). Would PC-relative addressing still make sense? Why?
    Even if a complete absolute address could be encoded in an instruction, PC-relative address might be much more compact since most branch targets are nearby, i.e., we could get by with a 16-bit offset instead of a 32-bit absolute address.


Problem 5.
  1. True or false: The Beta SUBC opcode could be eliminated since every SUBC instruction can be replaced an equivalent ADDC instruction.
    False: SUBC(Rx,0x8000,Rx) subtracts -32768 from Rx. The ADDC equivalent would add 32768 to Rx, but we can't express that constant in the signed, 16-bit constant field provided in the Beta instruction format.
  1. What is the binary representation for the Beta instruction SUBC(R17,12,R22)?
     SUBC   R22   R17          12
    110001 10110 10001 0000 0000 0000 1100 = 0xC6D1000C
    
  1. A certain TA wants to know what would happen if the Beta as implemented in the lab executed 0xEDEDEDED as an instruction. What does happen?
    0xEDEDEDED = 111011 01111 01101 1110 1101 1110 1101
    The opcode field correspsonds to an illegal instruction opcode which causes the beta to take a trap (saving the PC+4 of the offending instruction in the XP register) and set the PC to ILLOP.
  1. Suppose that the Beta instruction BR(error) were assembled into memory location 0x87654. Assuming that the instruction works as intended (i.e., when executed, control is transferred to the first instruction in the error routine), which of the following is the best statement about the possible values for the symbol "error"?
    1. it depends on the first instruction in the error routine.
    2. it can have any 32-bit value
    3. it can have any 32-bit value that is a multiple of 4
    4. it is a multiple of 4 in the range 0x7F658 to 0x8F654 inclusive.
    5. it is a multiple of 4 in the range 0x67658 to 0xA7654 inclusive.
    6. none of the above
    (E): A branch instruction in which the branch is taken will multiply the sign-extended 16-bit literal field by 4 and add it to PC+4.
    if literal = 0x8000 then
      new PC = 0x87654 + 4 - (8000 * 4) = 0x67658
    if literal = 0x7FFF then
      new PC = 0x87654 + 4 + (7FFF * 4) = 0xA7654
    


Problem 6. The Meta is a processor similar to the Beta, except that the data paths have been modified to accommodate the addition of a new Subtract One and Branch instruction:
  Usage: SOB(Ra,label,Rc)
  Operation:
    literal = ((OFFSET(label) - OFFSET(current inst))/4) - 1
    PC = PC + 4
    EA = PC + 4*SEXT(literal)
    Reg[Rc] = Reg[Ra] - 1
    if (Reg[Ra]- 1) != 0 then PC = EA
As with branches in the Beta, the binary encoding of the SOB instruction places the low-order 16 bits of the "literal" value in the low-order 16 bits of the instruction. The designers of the Meta implementation have used the Meta's ALU to perform the subtraction.
  1. Suppose R1 contains the value 1. How will executing SOB(R1,label,R31) change register R1 and the PC?
    R1 is unchanged since the destination register (Rc) of the example SOB instruction is R31. Reg[R1]-1 = 0, so the branch is not taken and so the PC will point to the instruction following the SOB.
  1. Consider the following instruction sequence:
    loop: ADD(R1,R2,R3)
          SOB(R4,loop,R4)
    
    Assuming the ADD instruction is placed in location 0x108 of memory, what are the contents of the low-order 16 bits of the SOB instruction?
    Actually we don't need to know the address of the ADD instruction to answer the question since the SOB instruction (like all Beta branches) uses PC-relative addressing. Remembering that the branch offset is computed from the PC of the instruction following the SOB, the correct contents of the offset field is -2 = 0xFFFE.
  1. A schematic for the adder circuitry in the ALU of the Meta is shown below:
    What would be the correct values for OP[2:0] in order to perform a subtract (i.e., SUM = A - B)?
    SUM = A - B = A + (~B + 1). Setting OP2 = 1 and OP1 = 0 selects ~B as the XB input to the 32-bit add, setting OP0 = 1 asserts the carry in for the low-order bit of the 32-bit add and hence provides the required "+1". So OP[2:0] = 0b101.
  1. What would be the correct values for OP[2:0] in order to perform the decrement needed for the SOB instruction (i.e., SUM = A - 1)?
    If OP[2:0] = 0b110, the XB input is set to (B or ~B) = all ones, the two's complement representation for -1. Carry-in should be set to 0.
  1. Is it possible to use the logic above to do an increment (i.e., SUM = A+1)?
    Yes, OP[2:0] = 0b001, setting XB to 0 and the carry-in to 1.


Problem 7. A local junk yard offers older CPUs with non-Beta architectures that require several clocks to execute each instruction. Here are the specifications:
ModelClock RateAvg. clocks/Inst.
x40 Mhz2.0
y100 Mhz10.0
z60 Mhz3.0
You are going to choose the machine which will execute your benchmark program the fastest, so you compiled and ran the benchmark on the three machines and counted the total instructions executed:
    x: 3,600,000 instructions executed
    y: 1,900,000 instructions executed
    z: 4,200,000 instructions executed
  1. Based on the above data which machine would you choose?
    Total execution time:

    x: (3,600,000 insts)(2 clocks/inst)(25 ns/inst) = 0.18 seconds
    y: (1,900,000 insts)(10 clocks/inst)(10 ns/inst) = 0.19 seconds
    z: (4,200,000 insts)(3 clocks/inst)(16.67 ns/inst) = 0.21 seconds
    X ran the benchmark the fastest.


Problem 8. Kerry DeWay is proposing to add a "Load Constant" instruction LDC(const,Rx) to the Beta instruction set. LDC loads the 32-bit constant const in register Rx. She can't convince the hardware team to implement LDC directly and consequently plans to define it as a macro. She is considering the following alternative implementations:
[1] .macro LDC(const,Rx) {
       LD(.+8,Rx)
       BR(.+8)
       LONG(const)
    }

[2] .macro LDC(const,Rx) {
       PUSH(R17)
       BR(.+8,R17)
       LONG(const)
       LD(R17,0,Rx)
       POP(R17)
    }

[3] .macro LDC(const,Rx) {
       ADDC(R31,const >> 16,Rx)
       SHLC(Rx,16,Rx)
       ADDC(Rx,const & 0xFFFF,Rx)
    }
Kerry tries each definition on a few test cases and convinces herself each works fine. The Quality Assurance team isn't so sure and complains that Kerry's LDC implementations don't all work for every choice of register (Rx), every choice of constant (const), and every choice of code location.
  1. Evaluate each approach and decide whether it works under all circumstances or if it fails, indicate that it misbehaves for certain choices of Rx, const or code location.
    [1] fails if the code is located so that the LD instruction is at, e.g., address 0x7FFC since we can't represent .+8 = 0x8004 in the 16-bit literal field of the LD instruction.[2] fails for LDC(const,R17) since the POP(R17) at the end of the macro restores the old value of R17, wiping out the constant we just loaded.[3] fails for any const which has bit 15 set (e.g., 0x8000) since the final ADDC will sign-extended its literal field, adding 0xFFFF to the high half of Rx.


Problem 9. Which of the following Beta instruction sequences might have resulted from compiling the following C statement?
int x[20], y;
y = x[1] + 4;
  1. LD (R31, x + 1, R0)
    ADDC (R0, 4, R0)
    ST (R0, y, R31)
    Not this one. If x[0] is stored at location x, x[1] is stored at location x + 4 since x[] is an integer array and each integer takes one word (4 bytes).
  1. CMOVE (4, R0)
    ADDC (R0, x + 4, R0)
    ST (R0, y, R31)
    Not this one. The second instructions adds the address of x[1] to R0, not the contents of x[1].
  1. LD (R31, x + 4, R0)
    ST (R0, y + 4, R31)
    Not this one. This stores x[1] in the location following the one word of storage allocated for y.
  1. CMOVE (4, R0)
    LD (R0, x, R1)
    ST (R1, y, R0)
    Not this one. This implements y[1] = x[1].
  1. LD (R31, x + 4, R0)
    ADDC (R0, 4, R0)
    ST (R0, y, R31)
    Yes!
  1. ADDC (R31, x + 1, R0)
    ADDC (R0, 4, R0)
    ST (R0, y, R31)
    Not this one. The ADDC instruction loads the address of x plus 1 into R0.


Problem 10. An unnamed associate of yours has broken into the computer (a Beta of course!) that 6.004 uses for course administration. He has managed to grab the contents of the memory locations he believes holds the Beta code responsible for checking access passwords and would like you to help discover how the password code works. The memory contents are shown in the table below:
Address Contents (in hexadecimal)
0x100 0xC05F0008
0x104 0xC03F0000
0x108 0xE060000F
0x10C 0xF0210004
0x110 0xA4230800
0x114 0xF4000004
0x118 0xC4420001
0x11C 0x77E20002
0x120 0x77FFFFF9
0x124 0xA4230800
0x128 0x605F0124
0x12C 0x90211000
  1. Reconstruct the Beta assembly code that corresponds to the binary instruction encoding shown above. If the code sequence contains branches, be sure to indicate the destination of each branch.
    Address Contents        Opcode  Rc      Ra      Rb      Assembly
    0x100 0xC05F0008 110000 00010  11111   ADDC(R31, 0x8, R2)
    0x104 0xC03F0000 110000 00001  11111  ADDC(R31, 0x0, R1)
    0x108 0xE060000F 111000 00011  00000  ANDC(R0, 0xF, R3)
    0x10C 0xF0210004 111100 00001  00001  SHLC(R1, 0x4, R1)
    0x110 0xA4230800 101001 00001 00011 00001 OR(R3, R1, R1)
    0x114 0xF4000004 111101 00000  00000  SHRC(R0, 0x4, R0)
    0x118 0xC4420001 110001 00010  00010  SUBC(R2, 0x1, R2)
    0x11C 0x77E20002 011101 11111  00010  BEQ(R2,0x128) *
    0x120 0x77FFFFF9 011101 11111  11111  BEQ(R31,0x108) ** 
    0x124 0xA4230800 101001 00001  00011  not an opcode
    0x128 0x605F0124 011000 00010  11111  LD(0x0124,R2)
    0x12C 0x90211000 100100 00001  00001 00010 CMPEQ(R1,R2,R1)
    
    * The literal in instruction 0x11c is 0x2, so the corresponding label in
      Beta assembly is
     PC + 4 + 4*literal = 0x11c + 4 + 4*2 = 0x128
    ** In instruction 0x120, SEXT(literal) = -7, so the corresponding label
       in Beta assembly is
     PC + 4 + 4*literal = 0x120 + 4 + 4*(-7) = 0x124 - 0x01C = 0x108
    
  1. Further investigation reveals that the password is just a 32-bit integer which is in R0 when the code above is executed and that the system will grant access if R1 = 1 after the code has been executed. What "passnumber" will gain entry to the system?
    Let's analyze this assembly by translating it to pseudo-code:
     R2 = 8; /* R2 is used as a counter */
     R1 = 0;
    
    loop: R3 = R0 & 0xF; /* R3 stores the current low nibble of R0 */
     R1 = R1 << 4;
     R1 = R3 | R1;
     R0 = R0 >> 4;
     R2 = R2 - 1;
     if R2 == 0 goto done;
     goto loop;
    
    data: 0xA4230800
    
    done: LD(data,R2);
     if (R1 == R2)
      R1=1;
     else
      R1=0;
    
    We can see that the code shifts R1 left by a nibble (4 bits) and ors it with the low nibble (R3) of the user's entered password (R0). It then shifts the user's password right by a nibble and loops back to the beginning. It does this a total of 8 times. The net effect is to reverse the order of the nibbles in R0 and to store this into R1. The result is then compared to 0xA4230800. Therefore, in order for the entered password to be accepted, it must be the nibble-reversed version of 0xA4230800.
    Thus, the "passnumber" required to enter is 0x0080324A

Monday, April 29, 2024

STACK Memory

Let's look at the following example and see how the local variables are set in the stack area:
void f1()
{
 int a;
 short b[4];
 double c;
 f2();
 f3();
} 
When we call f1() function, space needs to be set aside. The following picture shows activation records/stack frame:
locals 
When the function f1() is called, the stack pointer will be decremented by 20 bytes which is the size of the variables of f1().
stack_pointer 
void f2()
{
 int x;
 char *y;
 char *z[2];
 f3();
}

void f3()
{
 double m[3];
 int n;
}
The following diagram shows the movement of stack pointer, sp: f1()->f1.f2()->f2.f3()->f2.f3() exits->f1.f2() exits.
stack_pointer_f1_f2_f3
After the f1.f2(), when we call f1.f3(), the variables of the function f3() simply overwrites them onto the area where the variables of f2() were.