Leveraging Existing Tests in Automated Test Generation for Web Applications
Amin Milani Fard, Mehdi Mirzaaghaei, Ali Mesbah
University of British Columbia
Vancouver, BC, Canada
{aminmf, mehdi, amesbah}@ece.ubc.ca

ABSTRACT
To test web applications, developers currently write test cases in frameworks such as Selenium. On the other hand, most web test generation techniques rely on a crawler to explore the dynamic states of the application. The first approach requires much manual effort, but benefits from the domain knowledge of the developer writing the test cases. The second one is automated and systematic, but lacks the domain knowledge required to be as effective. We believe combining the two can be advantageous. In this paper, we propose to (1) mine the human knowledge present in the form of input values, event sequences, and assertions, in the human-written test suites, (2) combine that inferred knowledge with the power of automated crawling, and (3) extend the test suite for uncovered/unchecked portions of the web application under test. Our approach is implemented in a tool called Testilizer. An evaluation of our approach indicates that Testilizer (1) outperforms a random test generator, and (2) on average, can generate test suites with improvements of up to 150% in fault detection rate and up to 30% in code coverage, compared to the original test suite.

Categories and Subject Descriptors
D.2.5 [Software Engineering]: Testing and Debugging

General Terms
Verification, Algorithms, Experimentation

Keywords
Automated test generation; test reuse; web applications

1. INTRODUCTION
Web applications have become one of the fastest growing types of software systems today. Testing modern web applications is challenging since multiple languages, such as HTML, JavaScript, CSS, and server-side code, interact with each other to create the application. The final result of all these interactions at runtime is manifested through the Document Object Model (DOM) and presented to the end-user in the browser. To avoid dealing with all these complex interactions separately, many developers treat the web application as a black-box and test it via its manifested DOM, using testing frameworks such as Selenium [6]. These DOM-based test cases are written manually, which is a tedious process with an incomplete result. On the other hand, many automated testing techniques [13, 19, 28, 31] are based on crawling to explore the state space of the application. Although crawling-based techniques automate the testing to a great extent, they are limited in three areas:

Input values: Having valid input values is crucial for proper coverage of the state space of the application. Generating these input values automatically is challenging since many web applications require a specific type, value, and combination of inputs to expose the hidden states behind input fields and forms.

Paths to explore: Industrial web applications have a huge state space. Covering the whole space is infeasible in practice. To avoid unbounded exploration, which could result in state explosion, users define constraints on the depth of the path, exploration time, or number of states. Not knowing which paths are important to explore results in obtaining a partial coverage of a specific region of the application.

Assertions: Any generated test case needs to assert the application behaviour. However, generating proper assertions automatically without human knowledge is known to be challenging. As a result, many web testing techniques rely on generic invariants [19] or standard validators [11] to avoid this problem.

These two approaches work at the two extreme ends of the spectrum, namely, fully manual or fully automatic. We believe combining the two can be advantageous. In particular, humans may have the domain knowledge to see which interactions are more likely or important to cover than others; they may be able to use domain knowledge to enter valid data into forms; and they might know what elements on the page need to be asserted and how. This knowledge is typically manifested in manually-written test cases.

In this paper, we propose to (1) mine the human knowledge existing in manually-written test cases, (2) combine that inferred knowledge with the power of automated crawling, and (3) extend the test suite for uncovered/unchecked portions of the web application under test. We present our technique and tool called Testilizer, which, given a set of Selenium test cases TC and the URL of the application, automatically infers a model from TC, feeds that model to a crawler to expand it by exploring uncovered paths and states, generates assertions for newly detected states based on the patterns learned from TC, and finally generates new test cases. To the best of our knowledge, this work is the first to propose an approach for extending a web application test suite by leveraging existing test cases. The main contributions of our work include:

• A novel technique to address limitations of automated test generation techniques by leveraging human knowledge from existing test cases.
• An algorithm for mining existing test cases to infer a model that includes (1) input data, (2) event sequences, and (3) assertions, and feeding and expanding that model through automated crawling.
• An algorithm for reusing human-written assertions in existing test cases by exact/partial assertion matching as well as through a learning-based mechanism for finding similar assertions.
• An implementation of our technique in an open source tool, called Testilizer [7].
• An empirical evaluation of the efficacy of the generated test cases on four web applications. On average, Testilizer can generate test suites with improvements of up to 150% on the fault detection rate and up to 30% on the code coverage, compared to the original test suite.

Figure 1: A snapshot of the running example and its partial DOM structure.

2. BACKGROUND AND MOTIVATION
In practice, web applications are largely tested through their DOM using frameworks such as Selenium. The DOM is a dynamic tree-like structure representing user interface elements in the web application, which can be dynamically updated through client-side JavaScript interactions or server-side state changes propagated to the client-side. DOM-based testing aims at bringing the application to a particular DOM state through a sequence of actions, such as filling a form and clicking on an element, and subsequently verifying the existence or properties (e.g., text, visibility, structure) of particular DOM elements in that state. Figure 1 depicts a snapshot of a web application and Figure 2 shows a simple DOM-based (Selenium) test case for that application. For this paper, a DOM state is formally defined as:

Definition 1 (DOM State). A DOM State DS is a rooted, directed, labeled tree. It is denoted by a 5-tuple <D, Q, o, Ω, δ>, where D is the set of vertices, Q is the set of directed edges, o ∈ D is the root vertex, Ω is a finite set of labels, and δ : D → Ω is a labelling function that assigns a label from Ω to each vertex in D. □

The DOM state is essentially an abstracted version of the DOM tree of a web application, displayed on the web browser at runtime. This abstraction is conducted through the labelling function δ, the implementation of which is discussed in Section 3.1 and Section 4.

Motivation. Overall, our work is motivated by the fact that a human-written test suite is a valuable source of domain knowledge, which can be exploited for tackling some of the challenges in automated web application test generation. Another motivation behind our work is that manually written test cases typically correspond to the most common happy paths of the application. Automated analysis can subsequently expand these to cover unexplored bad-weather application behaviour.


 1  @Test
 2  public void testAddNote() {
 3    get("http://localhost:8080/theorganizer/");
 4    findElement(By.id("logon_username")).sendKeys("user");
 5    findElement(By.id("logon_password")).sendKeys("pswd");
 6    findElement(By.cssSelector("input[type=\"image\"]")).click();
 7    assertEquals("Welcome to The Organizer!", closeAlertAndGetItsText());
 8    findElement(By.id("newNote")).click();
 9    findElement(By.id("noteCreateShow_subject")).sendKeys("Running Example");
10    findElement(By.id("noteCreateShow_text")).sendKeys("Create a simple running example");
11    findElement(By.cssSelector("input[type=\"image\"]")).click();
12    assertEquals("Note has been created.", driver.findElement(By.id("mainContent")).getText());
13    findElement(By.id("logoff")).click();
14  }

Figure 2: A human-written DOM-based (Selenium) test case for the Organizer.

Running example. Figure 1 depicts a snapshot of the Organizer [4], a web application for managing notes, contacts, tasks, and appointments, which we use as a running example to show how input data, event paths, and assertions can be leveraged from the existing test cases to generate effective test cases. Suppose we have a small test suite that verifies the application's functionality for "adding a new note" and "adding a new contact". Due to space constraints, we only show the testAddNote test case in Figure 2. The test case contains valuable information regarding how to log onto the Organizer (Lines 4–5), what data to insert (Lines 9–10), where to click (Lines 6, 8, 11, 13), and what to assert (Lines 7, 12). We believe this information can be extracted and leveraged in automated test generation. For example, the paths (i.e., sequences of actions) corresponding to these covered functionalities can be used to create an abstract model of the application, shown in thick solid lines in Figure 3. By feeding this model, which contains the event sequences and input data leveraged from the test case, to a crawler, we can explore alternative paths for testing, shown as thin lines in Figure 3; alternative paths for deleting/updating a note/contact that result in newly detected states (i.e., s10 and s11) are highlighted as dashed lines. Further, the assertions in the test case can be used as guidelines for generating new assertions on the newly

Figure 3: Partial view of the running example application’s state-flow graph.

detected states along the alternative paths. These original assertions can be seen as parallel lines inside the nodes on the graph of Figure 3. For instance, line 12 of Figure 2 verifies the existence of the text "Note has been created" for an element (span) with id="mainContent", which can be assigned to the DOM state s4 in Figure 3. By exploring alternative paths around existing paths and learning assertions from existing assertions, new test cases can be generated. For example, the events corresponding to states Index, s1, s2, s10, s4, s5 can be turned into a new test method testUpdateNote(), which, on state s4, verifies the existence of an element with id="mainContent". Further, patterns found in existing assertions can guide us to generate similar assertions for newly detected states (e.g., s9, s10, s11) that have no assertions.

[Figure 4 blocks: Human-Written Test Suite; Browser; (1) Instrument and Execute Test Suite; (2) Execute Test Operations; (3) Analyze DOM Update; (4) Explore Alternative Paths; (5) Regenerate Assertions (add assertions); (6) Generate Test Suite; Test Operations Dataset; State-Flow Graph; Generated Test Suite.]

Figure 4: Processing view of our approach.

3. APPROACH

Figure 4 depicts an overview of our approach. At a high level, given the URL of a web application and its human-written test suite, our approach mines the existing test suite to infer a model of the covered DOM states and event-based transitions, including input values and assertions (blocks 1, 2, and 3). Using the inferred model as input, it explores alternative paths leading to new DOM states, thus expanding the model further (blocks 3 and 4). Next, it regenerates assertions for the new states, based on the patterns found in the assertions of the existing test suite (block 5), and finally generates a new test suite from the extended model, which is a superset of the original human-written test suite (block 6). We discuss each of these steps in more detail in the following subsections.

3.1 Mining Human-Written Test Cases

To infer an initial model, in the first step, we (1) instrument and execute the human-written test suite T to mine an intermediate dataset of test operations. Using this dataset, we (2) run the test operations to infer a state-flow graph (3) by analyzing DOM changes in the browser after the execution of each test operation.

Instrumenting and executing the test suite. We instrument the test suite (block 1 in Figure 4) to collect information about DOM interactions, such as elements accessed in actions (e.g., clicks) and assertions, as well as the structure of the DOM states covered.

Definition 2 (Manual-test Path). A manual-test path is the sequence of event-based actions performed while executing a human-written test case t ∈ T. □

Definition 3 (Manual-test State). A manual-test state is a DOM state located on a manual-test path. □

The instrumentation hooks into any code that interacts with the DOM in any part of the test case, such as test setup, helper methods, and assertions. Note that this instrumentation does not affect the functionality of the test cases (more details in Section 4). By executing the instrumented test suite, we store all observed manual-test paths as an intermediate dataset of test operations:

Definition 4 (Test Operation). A test operation is a triple <action, target, input>, where action specifies an event-based action (e.g., a click) or an assertion (e.g., verifying a text), target pertains to the DOM element to perform the action on, and input specifies input values (e.g., data for filling a form). □

The sequence of these test operations forms a dataset that is used to infer the initial model. For a test operation with an assertion as its action, we refer to the target DOM element as a checked element, defined as follows:

Definition 5 (Checked Element). A checked element ce ∈ vi is an element in the DOM tree in state vi, whose existence, value, or attributes are checked in an assertion of a test case t ∈ T. □

For example, in line 12 of the test case in Figure 2, the text value of the element with ID "mainContent" is asserted and thus that element is a checked element. Part of the DOM structure at this state is shown in Figure 1, which depicts this checked element (the span with id="mainContent"). For each checked element we record the element location strategy used (e.g., XPath, ID, tag name, link text, or CSS selector) as well as the accessed values and innerHTML text. This information is later used in the assertion generation process (Section 3.3).

Constructing the initial model. We model a web application as a State-Flow Graph (SFG) [18, 19] that captures the dynamic DOM states as nodes and the event-driven transitions between them as edges.

Definition 6 (State-flow Graph). A state-flow graph SFG for a web application W is a labeled, directed graph, denoted by a 4-tuple <r, V, E, L>, where:
1. r is the root node (called Index) representing the initial DOM state after W has been fully loaded into the browser.
2. V is a set of vertices representing the states. Each v ∈ V represents an abstract DOM state DS of W, with a labelling function Φ : V → A that assigns a label from A to each vertex in V, where A is a finite set of DOM-based assertions in a test suite.
3. E is a set of (directed) edges between vertices. Each (v1, v2) ∈ E represents a clickable c connecting two states if and only if state v2 is reached by executing c in state v1.
4. L is a labelling function that assigns a label, from a set of event types and DOM element properties, to each edge.
5. SFG can have multi-edges and be cyclic. □

An example of such a partial SFG is shown in Figure 3. The abstract DOM state is an abstracted version of the DOM tree of a web application, displayed on the web browser at runtime. This abstraction can be conducted by using a DOM string edit distance, or by disregarding specific aspects of a DOM tree (such as irrelevant attributes, time stamps, or styling issues) [19]. The state abstraction plays an important role in reducing the size of the SFG since many subtle DOM differences do not represent a proper state change, e.g., when a row is added to a table.

Algorithm 1 shows how the initial SFG is inferred from the manual-test paths. First the initial index state is added as a node to an empty SFG (Algorithm 1, lines 5–7). Next, for each test operation in the mined dataset (TOP), it finds DOM elements using the locator information and applies the corresponding actions. If an action is a DOM-based assertion, the assertion is added to the set of assertions of the corresponding DOM state node (Algorithm 1, lines 8–17). The state comparison to determine a new state (line 15) is carried out via a state abstraction function (more explanation in Section 4).
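To make Definition 4 concrete, the intermediate dataset can be thought of as an ordered list of (action, target, input) triples recorded while the instrumented tests run. The following is a minimal, hypothetical Java sketch of such a record; the class, enum, and field names are illustrative assumptions and are not taken from the Testilizer code base.

import java.util.ArrayList;
import java.util.List;

// A minimal, hypothetical representation of Definition 4: one recorded
// test operation is an (action, target, input) triple.
public class TestOperation {
    public enum ActionType { CLICK, SEND_KEYS, ASSERT }

    final ActionType action;    // event-based action or assertion
    final String targetLocator; // how the DOM element was located (e.g., "id=logon_username")
    final String input;         // input value, or expected value for an assertion

    TestOperation(ActionType action, String targetLocator, String input) {
        this.action = action;
        this.targetLocator = targetLocator;
        this.input = input;
    }

    public static void main(String[] args) {
        // The sequence below mirrors the first steps of testAddNote in Figure 2.
        List<TestOperation> dataset = new ArrayList<>();
        dataset.add(new TestOperation(ActionType.SEND_KEYS, "id=logon_username", "user"));
        dataset.add(new TestOperation(ActionType.SEND_KEYS, "id=logon_password", "pswd"));
        dataset.add(new TestOperation(ActionType.CLICK, "css=input[type=\"image\"]", null));
        dataset.add(new TestOperation(ActionType.ASSERT, "alert", "Welcome to The Organizer!"));
        dataset.forEach(op -> System.out.println(op.action + " " + op.targetLocator));
    }
}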

Algorithm 1: State-Flow Graph Inference
input : A web application URL, a DOM-based test suite TS, crawling constraints CC
output: A state-flow graph SFG

Procedure InferSFG(URL, TS, CC)
begin
 1:  TSinst ← Instrument(TS)
 2:  Execute(TSinst)
 3:  TOP ← ReadTestOperationDataset()
 4:  SFGinit ← ∅
 5:  browser.Goto(URL)
 6:  dom ← browser.GetDOM()
 7:  SFGinit.AddInitialState(dom)
 8:  for top ∈ TOP do
 9:    C ← GetClickables(top)
10:    for c ∈ C do
11:      assertion ← GetAssertion(top)
12:      dom ← browser.GetDOM()
13:      robot.FireEvent(c)
14:      new_dom ← browser.GetDOM()
15:      if dom.HasChanged(new_dom) then
16:        SFGinit.Update(c, new_dom, assertion)
17:    browser.Goto(URL)
18:  SFGext ← SFGinit
19:  ExploreAlternativePaths(SFGext, CC)
20:  return SFGext

Procedure ExploreAlternativePaths(SFG, CC)
begin
21:  while ConstraintSatisfied(CC) do
22:    s ← GetNextToExploreState(SFG)
23:    C ← GetCandidateClickables(s)
24:    for c ∈ C do
25:      browser.Goto(SFG.GetPath(s))
26:      dom ← browser.GetDOM()
27:      robot.FireEvent(c)
28:      new_dom ← browser.GetDOM()
29:      if dom.HasChanged(new_dom) then
30:        SFG.Update(c, new_dom)
31:    ExploreAlternativePaths(SFG, CC)



3.2 Exploring Alternative Paths
At this stage, we have a state-flow graph that represents the covered states and paths from the human-written test suite. In order to further explore the web application to find alternative paths and new states, we seed the graph to an automated crawler (block 4 in Figure 4). The exploration strategy can be conducted in various ways: (1) remaining close to the manual-test paths, (2) diverging [20] from the manual-test paths, or (3) randomly exploring. In this work, we have opted for the first option, namely staying close to the manual-test paths. The reason is to maximize the potential for reusing and learning from existing assertions. Our insight is that if we diverge too much from the manual-test paths and states, the human-written assertions will also be too disparate and thus less useful.

To find alternative paths, events are automatically generated on DOM elements, and if the DOM is mutated as a result, the new state and the corresponding event transition are added to the SFG. Note that the state comparison to determine a new state (line 29) is carried out via the same state abstraction function used before (line 15). The procedure ExploreAlternativePaths (Algorithm 1, lines 21–31) recursively explores the application until a pre-defined constraint (e.g., maximum time or number of states) is reached. The algorithm is guided by the manual-test states while exploring alternative paths (Line 22); GetNextToExploreState decides which state should be expanded next. It gives the highest priority to the manual-test states, and when all manual-test states are fully expanded, the next immediate states found are explored further. More specifically, it randomly selects a manual-test state that contains unexercised candidate clickables and navigates the application further through that state. The GetCandidateClickables method (Line 23) returns a set of candidate clickables that can be applied on the selected state. This process is repeated until all manual-test states are fully expanded. For example, consider the manual-test states shown in grey circles in Figure 3. The method starts by randomly selecting a state, e.g., s2, navigating the application to reach that state from the Index state, and firing an event on s2, resulting in a new state s10.
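The prioritization of manual-test states can be illustrated with the hypothetical Java sketch below; the State interface and selection logic are simplified placeholders, not Testilizer's actual classes.

import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical sketch of GetNextToExploreState: manual-test states with
// unexercised clickables are expanded first; only when they are exhausted
// do we move on to other states discovered during crawling.
class ExplorationStrategy {
    private final Random random = new Random();

    interface State {
        boolean isManualTestState();        // located on a manual-test path (Definition 3)
        boolean hasUnexercisedClickables(); // candidate clickables not fired yet
    }

    Optional<State> nextToExplore(List<State> allStates) {
        List<State> manual = allStates.stream()
                .filter(s -> s.isManualTestState() && s.hasUnexercisedClickables())
                .collect(Collectors.toList());
        if (!manual.isEmpty()) {
            // Randomly pick among the not-yet-fully-expanded manual-test states.
            return Optional.of(manual.get(random.nextInt(manual.size())));
        }
        // Otherwise fall back to any remaining state with unexercised clickables.
        return allStates.stream().filter(State::hasUnexercisedClickables).findFirst();
    }
}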



3.3 Regenerating Assertions
The next step is to generate assertions for the new DOM states in the extended SFG (block 5 in Figure 4). In this work, we propose to leverage existing assertions to regenerate new ones. By analyzing human-written assertions we can infer information regarding (1) portions of the page that are considered important for testing; for example, a banner section or decorative parts of a page might not be as important as inner content that changes according to a main functionality, and (2) patterns in the page that might be part of a template. Therefore, extracting patterns from existing assertions may help us in generating new but similar assertions.

We formally define a DOM-based assertion as a function A : (s, c) → {True, False}, where s is a DOM state and c is a DOM condition to be checked. It returns True if s matches/satisfies the condition c, denoted by s |= c, and False otherwise. We say that an assertion A subsumes (implies) assertion B, denoted by A =⇒ B, if whenever A evaluates to True, B also evaluates to True. This means that B can be obtained from A by weakening A's condition. In this case, A is more specific/constrained than B. For instance, an assertion verifying the existence of a checked element can be implied by an assertion which verifies both the existence of that element and its attributes/textual values.

Algorithm 2 shows our assertion regeneration procedure. We consider each manual-test state si (Definition 3) in the SFG and try to reuse existing associated assertions in si or generate new ones based on them for another state sj. We extend the set of DOM-based assertions in three forms: (1) reusing the same assertions from manual-test states for states without such assertions, (2) regenerating assertions with the exact assertion pattern structure as the original assertions but adapted for another state, and (3) learning structures from the original assertions to generate similar assertions for other states.

Algorithm 2: Assertion Regeneration
input: An extended state-flow graph SFG = <r, V, E, L>

Procedure RegenerateAssertions(SFG)
begin
      /* Learn from DOM elements in the manual-test states */
 1:   dataset ← MakeDataset(SFG.GetManualTestStates())
 2:   Train(dataset)
 3:   for si ∈ V do
 4:     for ce ∈ si.GetCheckedElements() do
 5:       assert ← ce.GetAssertion()
 6:       cer ← ce.GetCheckedElementRegion()
 7:       si.AddRegFullAssertion(cer)
 8:       for sj ∈ V ∧ sj ≠ si do
 9:         dom ← sj.GetDOM()
            /* Generate exact element assertion for sj */
10:         if ElementFullMatched(ce, dom) then
11:           sj.ReuseAssertion(ce, assert)
12:         else if ElementTagAttMatched(ce, dom) then
13:           sj.AddElemTagAttAssertion(ce)
            /* Generate exact region assertion for sj */
14:         if RegionFullMatched(cer, dom) then
15:           sj.AddRegFullAssertion(cer)
16:         else if RegionTagAttMatched(cer, dom) then
17:           sj.AddRegTagAttAssertion(cer)
18:         else if RegionTagMatched(cer, dom) then
19:           sj.AddRegTagAssertion(cer)
        /* Generate similar region assertions for si */
20:     for be ∈ si.GetBlockElements() do
21:       if Predict(be) == 1 then
22:         si.AddRegTagAttAssertion(be.GetRegion())

Table 1: Summary of the assertion reuse/regeneration conditions for an element ej on a DOM state sj , given a checked element ei on state si .
ElementFullMatched: Tag(ei) = Tag(ej) ∧ Att(ei) = Att(ej) ∧ Txt(ei) = Txt(ej)
ElementTagAttMatched: Tag(ei) = Tag(ej) ∧ Att(ei) = Att(ej)
RegionFullMatched: Tag(R(ei, si)) = Tag(R(ej, sj)) ∧ Att(R(ei, si)) = Att(R(ej, sj)) ∧ Txt(R(ei, si)) = Txt(R(ej, sj))
RegionTagAttMatched: Tag(R(ei, si)) = Tag(R(ej, sj)) ∧ Att(R(ei, si)) = Att(R(ej, sj))
RegionTagMatched: Tag(R(ei, si)) = Tag(R(ej, sj))



3.3.1 Assertion Reuse

As an example of assertion reuse, consider Figure 3 and the manual-test path with the sequence of states Index, s1, s2, s3, s4, s5 for adding a note. The assertions in Figure 2, lines 7 and 12, are associated with states s1 and s4, respectively. Suppose that we explore an alternative path for deleting a note with the sequence Index, s1, s2, s10, s4, s5, which was not originally considered by the developer. Since the two test paths share a common path from Index to s1, the assertion on s1 can be reused for the new test case (note deletion) as well. This is a simple form of assertion reuse on new test paths.

3.3.2 Assertion Regeneration


We regenerate two types of precondition assertions, namely exact element-based assertions and exact region-based assertions. By "exact" we mean repetition of the same structure of an original assertion on a checked element. The rationale behind our technique is to use the location and properties of checked elements and their close-by neighbourhood in the DOM tree to regenerate assertions, which focus on the exact repeated structures and patterns in other DOM states. This approach is based on our intuition that checking the close-by neighbourhood of checked elements is just as important.

Exact element assertion generation. We define assertions of the form A(sj, c(ej)) with a condition c(ej) for element ej on state sj. Given an existing checked element (Definition 5) ei on a DOM state si, we consider two conditions as follows:
1. ElementFullMatched: If a DOM state sj contains an element ej with the exact tag, attributes, and text value as ei, then reuse the assertion on ei for checking ej on sj.
2. ElementTagAttMatched: If a DOM state sj contains an element ej with the exact tag and attributes, but a different text value than ei, then generate an assertion on ej for checking its tag and attributes.
Table 1 summarizes these conditions. An example of a generated assertion is assertTrue(isElementPresent(By.id("mainContent"))), which checks the existence of a checked element with ID "mainContent". Such an assertion can be evaluated in any state in the SFG that contains that DOM element (and thus meets the precondition). Note that we could also propose assertions in case of mere tag matches; however, such assertions are not generally considered useful as they are too generic.

Exact region assertion generation. We define the term checked element region to refer to a close-by area around a checked element:

Definition 7 (Checked Element Region). For a checked element e on state s, a checked element region R(e, s) is a function R : (e, s) → {e, P(e), Ch(e)}, where P(e) and Ch(e) are the parent node and children nodes of e, respectively. □

For example, for the checked element e with ID "mainContent" in line 12 of Figure 2 (at state s4 in Figure 3), we have R(e, s4) = {e, P(e), Ch(e)}, where P(e) is its parent element and Ch(e) contains its three child elements, as shown in the partial DOM of Figure 1. We define assertions of the form A(sj, c(R(ej, sj))) with a condition c(R(ej, sj)) for the region R of an element ej on state sj. Given an existing checked element ei on a DOM state si, we consider three conditions as follows:
1. RegionFullMatched: If a DOM state sj contains an element ej such that R(ej, sj) has the exact tag, attribute, and text values of R(ei, si), then generate an assertion on R(ej, sj) for checking its tag, attributes, and text values.
2. RegionTagAttMatched: If a DOM state sj contains an element ej such that R(ej, sj) has the exact tag and attribute values of R(ei, si), then generate an assertion on R(ej, sj) for checking its tag and attribute values.
3. RegionTagMatched: If a DOM state sj contains an element ej such that R(ej, sj) has the exact tag value of R(ei, si), then generate an assertion on R(ej, sj) for checking its tag value.
Note that the assertion conditions are relaxed one after another. In other words, on a DOM state s, if s |= RegionFullMatched, then s |= RegionTagAttMatched; and if s |= RegionTagAttMatched, then s |= RegionTagMatched. Consequently, it suffices to use the most constrained assertion. We use this property for reducing the number of generated assertions in Section 3.3.4. Table 1 summarizes these conditions.

Assertions that we generate for a checked element region are targeted around a checked element. For instance, to check whether a DOM state contains a checked element region with its tag, attributes, and text values, an assertion will be generated in the form of assertTrue(isElementRegionFullPresent(parentElement, element, childrenElements)), where parentElement, element, and childrenElements are objects reflecting information about that region in the DOM. For each checked element ce on si, we also generate a RegionFull type of assertion for checking its region, i.e., verifying the RegionFullMatched condition on si (Algorithm 2 line 5). Lines 10–13 perform exact element assertion generation. The original assertion can be reused in case of ElementFullMatched (line 11). Lines 14–19 apply exact region assertion generation based on the observed matching. Notice the hierarchical selection, which guarantees generation of more specific assertions.
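The helper predicates referenced above (e.g., isElementPresent and isElementRegionFullPresent) are not defined in the text; the following Java/Selenium sketch illustrates one possible shape for such checks, with all method names and parameters being illustrative assumptions rather than Testilizer's actual API.

import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

// Illustrative (not Testilizer's actual) helpers for DOM-based region assertions.
class RegionAssertions {

    // True if an element matching the locator exists in the current DOM state.
    static boolean isElementPresent(WebDriver driver, By locator) {
        return !driver.findElements(locator).isEmpty();
    }

    // A rough RegionFullMatched-style check: the checked element, its parent,
    // and its children must match the expected tag names and text value.
    static boolean isElementRegionFullPresent(WebDriver driver, By locator,
                                              String expectedParentTag,
                                              String expectedText,
                                              List<String> expectedChildTags) {
        List<WebElement> matches = driver.findElements(locator);
        if (matches.isEmpty()) {
            return false;
        }
        WebElement element = matches.get(0);
        WebElement parent = element.findElement(By.xpath(".."));
        List<WebElement> children = element.findElements(By.xpath("./*"));
        if (!parent.getTagName().equalsIgnoreCase(expectedParentTag)
                || !element.getText().equals(expectedText)
                || children.size() != expectedChildTags.size()) {
            return false;
        }
        for (int i = 0; i < children.size(); i++) {
            if (!children.get(i).getTagName().equalsIgnoreCase(expectedChildTags.get(i))) {
                return false;
            }
        }
        return true;
    }
}

A generated assertion such as assertTrue(isElementRegionFullPresent(...)) would then call a helper of this kind with the values recorded from the checked element region on the manual-test state.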



3.3.3 Learning Assertions for Similar Regions

The exact element/region assertion regeneration techniques described above only consider the exact repetition of a checked element/region. However, there might be many other DOM elements that are similar to the checked elements but not exactly the same. For instance, consider Figure 2, line 12, in which a span element was checked in an assertion. If in another state an element exists (say, a div with ID "centreDiv") that is similar to that checked element in certain aspects, such as content and position on the page, we could generate a DOM-based assertion for it in the form of assertTrue(isElementPresent(By.id("centreDiv")));.

We view the problem of generating similar assertions as a classification problem, which decides whether a block-level DOM element is important to be checked by an assertion or not. To this end, we apply machine learning to train a classifier based on the features of the checked elements in existing assertions. More specifically, given a training dataset D of n DOM elements in the form D = {(xi, yi) | xi ∈ R^p, yi ∈ {−1, 1}}, i = 1, ..., n, where each xi is a p-dimensional real vector representing the features of a DOM element ei, and yi indicates whether ei is a checked element (+1) or not (−1), the classification function F : xj → yj maps a feature vector xj to its class label yj. To do so, we use a Support Vector Machine (SVM) [32] to find the max-margin hyperplane that divides the elements with yi = 1 from those with yi = −1. In the rest of this subsection, we describe the features we use, how we label the feature vectors, and how we generate similar region DOM-based assertions.

DOM element features. We present a set of features for a DOM element to be used in our classification task. A feature extraction function ψ : e → x maps an element e to its feature set x. Many of these features are based on and adapted from the work in [29], which performs page segmentation ranking for adaptation purposes. That work presented a number of spatial and content features that capture the importance of a webpage segment based on a comprehensive user study. Although it targeted a different problem than ours, we gained insight from its empirical work and use that to reason about the importance of a page segment for testing purposes. Our proposed DOM features are presented in Table 2. We normalize feature values to the range [0–1], as explained in Table 2, to be used in the learning phase. For example, for the checked element e with ID "mainContent" in Figure 1, ψ(e) is the vector of normalized values of the features ElementCenterX, ElementCenterY, ElementWidth, ElementHeight, TextImportance, InnerHtmlLength, LinkNum, and ChildrenNum.

Labelling the feature vectors. For the training phase, we need a dataset of feature vectors for DOM elements annotated with +1 (important to be checked in an assertion) and −1 (not important for testing) labels. After generating a feature vector for each checked DOM element, we label it with +1. For elements with label −1, we consider those with the most frequent features over all the manual-test states. Unlike previous work that focuses on DOM invariants [25], our insight is that DOM subtrees that are invariant across manual-test states are less important to be checked in assertions. In fact, most modern web applications execute a significant amount of client-side code in the browser to mutate the DOM at runtime; hence DOM elements that remain unchanged across application execution are more likely to be related to fixed (server-side) HTML templates. Consequently, such elements are less likely to contain functionality errors. Thus, for our feature vectors we consider all block elements (such as div, span, table) on the manual-test states and rank them in decreasing order of their occurrences. In order to have a balanced dataset of items belonging to {−1, +1}, we select the k top-ranked (i.e., k most frequent) elements with label −1, where k equals the number of label +1 samples.

Predicting new DOM elements. Once the SVM is trained on the dataset, it is used to predict whether a given DOM element should be checked in an assertion (Algorithm 2, Lines 20–23). If the condition F(ψ(e)) = 1 holds, we generate a RegionTagAtt type assertion (i.e., checking the tag and attributes of a region). We do not consider a RegionFull (i.e., checking tag, attributes, and text of a region) assertion type in this case because we are dealing with a similar detected region, not an exact one. Also, we do not generate a RegionTag assertion type because a higher priority should be given to the similar region-based assertions.

Table 2: DOM element features used to train a classifier.
ElementCenterX, ElementCenterY. Definition: The (x, y) coordinates of the centre of a DOM element, normalized by dividing by PageWidth and PageHeight (i.e., the width and height of the whole page), respectively. Rationale: Web designers typically put the most important information (main content) in the centre of the page, the navigation bar on the header or on the left side, and the copyright on the footer [29]. Thus, if the (x, y) coordinate of the centre of a DOM block is close to the (x, y) coordinate of the web page centre, that block is more likely to be part of the main content.

ElementWidth, ElementHeight. Definition: The width and height of the DOM element, also normalized by dividing by PageWidth and PageHeight, respectively. Rationale: The width and height of an element can be an indication of an important segment. Intuitively, large blocks typically contain much irrelevant noisy content [29].

TextImportance. Definition: A binary feature indicating whether the block element contains any visually important text. Rationale: Text in bold/italic style, or header elements (such as h1, h2, ..., h5) used to highlight and emphasize textual content, usually implies importance in that region.

InnerHtmlLength. Definition: The length of all HTML code (without whitespace) in the element block, normalized by dividing it by the InnerHtmlLength of the whole page. Rationale: The normalized feature value can indicate the block content size. Intuitively, blocks with many sub-blocks and elements are considered to be less important than those with fewer but more specific content [29].

LinkNum. Definition: The number of anchor (hyperlink) elements inside the DOM element, normalized by the link number of the whole page. Rationale: If a DOM region contains clickables, it is likely part of a navigational structure (menu) and not part of the main content [29].

ChildrenNum. Definition: The number of child nodes under a DOM node, normalized by dividing it by a constant (10 in our implementation) and setting the normalized value to 1 if it exceeds 1. Rationale: We have observed in many DOM-based test cases that checked elements do not have a large number of children nodes. Therefore, this feature can be used to discourage elements with many children from being selected for a region assertion, to enhance test readability.
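As a rough illustration of the classification step, the sketch below builds feature vectors for a few labelled DOM elements and trains an RBF-kernel SVM with LIBSVM's Java bindings [12]; the feature values are made-up numbers and the helper names are ours, not Testilizer's.

import libsvm.*;

// Hedged sketch: train an SVM on normalized DOM-element features (cf. Table 2)
// and predict whether a new element is worth asserting on.
public class AssertionElementClassifier {

    static svm_node[] toNodes(double[] features) {
        svm_node[] nodes = new svm_node[features.length];
        for (int i = 0; i < features.length; i++) {
            nodes[i] = new svm_node();
            nodes[i].index = i + 1;      // LIBSVM feature indices are 1-based
            nodes[i].value = features[i];
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Each row: ElementCenterX, ElementCenterY, ElementWidth, ElementHeight,
        // TextImportance, InnerHtmlLength, LinkNum, ChildrenNum (all in [0, 1]).
        double[][] vectors = {
            {0.5, 0.5, 0.6, 0.3, 1.0, 0.20, 0.05, 0.3},  // checked element  -> +1
            {0.5, 0.1, 0.9, 0.1, 0.0, 0.05, 0.60, 0.8},  // navigation bar   -> -1
            {0.5, 0.6, 0.5, 0.4, 1.0, 0.15, 0.00, 0.2},  // main content     -> +1
            {0.5, 0.95, 0.9, 0.05, 0.0, 0.02, 0.30, 0.1} // footer           -> -1
        };
        double[] labels = {1, -1, 1, -1};

        svm_problem problem = new svm_problem();
        problem.l = vectors.length;
        problem.y = labels;
        problem.x = new svm_node[vectors.length][];
        for (int i = 0; i < vectors.length; i++) {
            problem.x[i] = toNodes(vectors[i]);
        }

        svm_parameter param = new svm_parameter();
        param.svm_type = svm_parameter.C_SVC;
        param.kernel_type = svm_parameter.RBF;   // Gaussian RBF kernel, as in Section 5.2.1
        param.gamma = 0.5;
        param.C = 1.0;
        param.cache_size = 40;
        param.eps = 1e-3;

        svm_model model = svm.svm_train(problem, param);

        // Predict for a previously unseen block element.
        double prediction = svm.svm_predict(model,
                toNodes(new double[]{0.5, 0.55, 0.55, 0.35, 1.0, 0.18, 0.02, 0.25}));
        System.out.println(prediction > 0 ? "generate RegionTagAtt assertion" : "skip element");
    }
}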


3.3.4 Assertion Minimization

The proposed assertion regeneration technique can generate many DOM-based assertions per state, which in turn can make the generated test methods hard to comprehend and maintain. Therefore, we (1) avoid generating redundant assertions, and (2) prioritize assertions based on their constraints and effectiveness.

Avoiding redundant assertions. A newly reused/generated assertion for a state (Algorithm 2, lines 5, 11, 13, 15, 17, 19, and 22) might already be subsumed by, or may subsume, other assertions in that state. For example, an exact element assertion that verifies the existence of a checked element can be subsumed by an exact region assertion that has the same span element in either its checked element, parent, or children nodes. Assertions that are subsumed by other assertions are redundant and are safely eliminated to reduce the overhead in testing time and to increase the readability and maintainability of test cases. For a given state s with an existing assertion B, a new assertion A generated for s is treated as follows:
- Discard A, if B =⇒ A;
- Replace B with A, if A =⇒ B and B is not an original assertion;
- Add A to s, otherwise.

Prioritizing assertions. We prioritize the generated assertions such that, given a maximum number of assertions to produce per state, the more effective ones are ranked higher and chosen. We prioritize assertions in each state in the following order: the highest priority is given to the original human-written assertions; next are the reused, the RegionFull, the RegionTagAtt, the ElementTagAtt, and the RegionTag assertions. This ordering gives higher priority to more specific/constrained assertions first.
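The redundancy rule above can be read as a small decision procedure over a subsumption relation. The following hypothetical Java sketch shows the shape of that logic; Assertion and its implies method are placeholders for whatever concrete representation the assertions have.

import java.util.List;

// Hedged sketch of the minimization rule: given the assertions already kept
// for a state, decide what to do with a newly generated assertion A.
class AssertionMinimizer {

    interface Assertion {
        boolean implies(Assertion other); // A.implies(B) == (A =⇒ B), i.e., A is more specific
        boolean isOriginal();             // written by a human in the original test suite
    }

    static void addAssertion(List<Assertion> stateAssertions, Assertion a) {
        for (int i = 0; i < stateAssertions.size(); i++) {
            Assertion b = stateAssertions.get(i);
            if (b.implies(a)) {
                return; // discard A: an existing assertion is already at least as specific
            }
            if (a.implies(b) && !b.isOriginal()) {
                stateAssertions.set(i, a); // replace the weaker generated assertion with A
                return;
            }
        }
        stateAssertions.add(a); // otherwise keep A alongside the existing assertions
    }
}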

3.4 Test Suite Generation

In the final step, we generate a test suite from the extended state-flow graph. Each path from the Index node to a sink node (i.e., a node without outgoing edges) in the SFG is transformed into a unit test. Loops are included once. Each test case captures the sequence of events as well as any assertions for the target states. To make the test cases more readable for developers, information (such as tag names and attributes) about related DOM elements is generated as code comments.

After generating the extended test suite, we make sure that the reused/regenerated assertions are stable, i.e., do not falsely fail when running the test suite on an unmodified version of the web application. Some of these assertions are not only DOM related but also depend on the specific path through which the DOM state is reached. Our technique automatically identifies and filters these false positive cases from the generated test suite. This is done by executing the generated test suite and iteratively eliminating failing assertions from the test cases until all tests pass successfully.
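One straightforward way to realize the path enumeration described above is a depth-first traversal that emits Index-to-sink paths while taking each edge at most once per path. The sketch below is a hedged illustration over a simple adjacency-list graph, not the actual Testilizer traversal.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hedged sketch: enumerate paths from the Index state to sink states,
// including each edge of a loop at most once, so each path can be turned
// into one generated unit test.
class PathEnumerator {

    static List<List<String>> indexToSinkPaths(Map<String, List<String>> sfg, String index) {
        List<List<String>> paths = new ArrayList<>();
        Deque<String> current = new ArrayDeque<>();
        current.addLast(index);
        dfs(sfg, index, current, new HashSet<>(), paths);
        return paths;
    }

    private static void dfs(Map<String, List<String>> sfg, String state,
                            Deque<String> current, Set<String> visitedEdges,
                            List<List<String>> paths) {
        List<String> successors = sfg.getOrDefault(state, List.of());
        boolean extended = false;
        for (String next : successors) {
            String edge = state + "->" + next;
            if (visitedEdges.contains(edge)) {
                continue; // loops are included once
            }
            visitedEdges.add(edge);
            current.addLast(next);
            dfs(sfg, next, current, visitedEdges, paths);
            current.removeLast();
            visitedEdges.remove(edge);
            extended = true;
        }
        if (!extended) {
            paths.add(new ArrayList<>(current)); // sink reached: emit one test path
        }
    }

    public static void main(String[] args) {
        // A tiny graph mirroring part of Figure 3 (adding vs. updating a note).
        Map<String, List<String>> sfg = Map.of(
                "Index", List.of("s1"),
                "s1", List.of("s2"),
                "s2", List.of("s3", "s10"),
                "s3", List.of("s4"),
                "s10", List.of("s4"),
                "s4", List.of("s5"));
        indexToSinkPaths(sfg, "Index").forEach(System.out::println);
    }
}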

4. IMPLEMENTATION
The approach is implemented in a tool, called Testilizer, which is publicly available [7]. The state exploration component is built on top of Crawljax [18]. Testilizer requires as input the source code of the human-written test suite and the URL of the web application. Testilizer currently supports Selenium tests; however, our approach can easily be applied to other DOM-based tests as well. To instrument the test cases, we use JavaParser [2] to obtain an abstract syntax tree. We instrument all DOM-related method calls and calls with arguments that contain DOM element locators. We also log the DOM state after every event in the tests that is capable of changing the DOM. For the state abstraction function (as defined in Definition 1), we generate an abstract DOM state by ignoring recurring structures (patterns such as table rows and list items), textual content (such as the text node "Note has been created" in the partial DOM shown in Figure 1), and the contents of certain tags. For the classification step, we use LIBSVM [12], which is a popular library for support vector machines.
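As a hedged illustration of such a state abstraction function (not the actual Crawljax/Testilizer implementation), the following Java sketch uses the jsoup library, assumed to be available, to strip text nodes and script/style content before hashing the DOM into a state identifier.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

// Hedged sketch of a DOM state abstraction: drop text nodes and script/style
// content, then hash the remaining structure so that states differing only in
// textual content map to the same abstract DOM state.
class DomStateAbstraction {

    static String abstractState(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("script, style").remove();            // ignore non-structural tag contents
        for (Element element : doc.getAllElements()) {
            for (TextNode text : element.textNodes()) {  // ignore textual content such as
                text.remove();                           // "Note has been created."
            }
        }
        return sha1(doc.body() != null ? doc.body().html() : doc.html());
    }

    private static String sha1(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(input.getBytes())) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String a = "<div id='mainContent'><span>Note has been created.</span></div>";
        String b = "<div id='mainContent'><span>Another note text.</span></div>";
        // Both strings yield the same abstract state, since only the text differs.
        System.out.println(abstractState(a).equals(abstractState(b)));
    }
}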

5. EMPIRICAL EVALUATION
To assess the efficacy of our proposed technique, we have conducted a controlled experiment to address the following research questions:
RQ1 How much of the information (input data, event sequences, and assertions) in the original human-written test suite is leveraged by Testilizer?
RQ2 How successful is Testilizer in regenerating effective assertions?
RQ3 Does Testilizer improve coverage?
Our experimental data along with the implementation of Testilizer are available for download [7].


Table 3: Experimental objects.
Claroline e-learning (1.11.7): SLOC PHP (295K), JS (36K); 23 test methods; 35 assertions
PhotoGallery (3.31): SLOC PHP (5.6K), JS (1.5K); 7 test methods; 18 assertions
WolfCMS (0.7.8): SLOC PHP (35K), JS (1.3K); 12 test methods; 42 assertions
EnterpriseStore (1.0.0): SLOC Java (3K), JS (57K); 19 test methods; 17 assertions

Table 4: Test suite generation methods evaluated.
ORIG: action sequences written manually; assertions written manually.
Testilizer (EXND+AR): action sequences generated by traversing paths in the extended SFG inferred from the original tests; assertions produced by assertion regeneration.
EXND+RND: action sequences generated by traversing paths in the extended SFG inferred from the original tests; assertions generated randomly.
RAND+RND: action sequences generated by traversing paths in the SFG produced by random crawling; assertions generated randomly.


5.1 Experimental Objects

We selected four open source web applications that make extensive use of client-side JavaScript, fall under different application domains, and have Selenium test cases. The experimental objects and their properties are shown in Table 3. Claroline [1] is a collaborative e-learning environment, which allows instructors to create and administer courses. Phormer [5] is a photo gallery equipped with upload, comment, rate, and slideshow functionalities. WolfCMS [8] is a content management system. EnterpriseStore [9] is an enterprise asset management web application.


5.2 Experimental Setup

Our experiments are performed on Mac OS X, running on a 2.3GHz Intel Core i7 CPU with 8 GB memory, and FireFox 28.0.

5.2.1 Independent Variables

We compare the original human-written test suites with the test suites generated by Testilizer.

Test suite generation method. We evaluate different test suite generation methods for each application, as presented in Table 4. We compare Testilizer (EXND+AR) with three baselines: (1) ORIG: the original human-written test suite, (2) EXND+RND: a test suite generated by traversing the extended SFG, equipped with random assertion generation, and (3) RAND+RND: random exploration and random assertion generation. In random assertion generation, for each state we generate element/region assertions by randomly selecting from a pool of DOM-based assertions. These random assertions are based on the existence of an element/region in a DOM state. Such assertions are expected to pass as long as the application is not modified. However, due to our state abstraction this can result in unstable assertions, which are also automatically eliminated following the approach explained in Section 3.4. We further evaluate various instantiations of our assertion generation in EXND+AR, i.e., using only (a) original assertions, (b) reused assertions (Section 3.3.1), (c) exact generated assertions (Section 3.3.2), (d) similar region generated assertions (Section 3.3.3), and (e) a combination of all these types.

Exploration constraints. We confine the exploration time to five minutes in all the experiments, which should be acceptable in most testing environments. Suppose that in the EXND approach Testilizer spends time t to generate the initial SFG for an application. To make a fair comparison, we add this time t to the five minutes for the RAND exploration approach. We set no limits on the crawling depth nor on the maximum number of states to be discovered while looking for alternative paths. Note that for both EXND and RAND crawling, after a clickable element on a state was exercised, the crawler resets to the index page and continues crawling from another chosen state.

Maximum number of generated assertions. We constrain the maximum number of generated assertions for each state to five. To have a fair comparison, for the EXND+RND and RAND+RND methods, we perform the same assertion prioritization used in Testilizer and select the top ranked.

Learning parameters. We set the SVM's kernel function to the Gaussian RBF, and use 5-fold cross-validation for tuning the model and feature selection.

5.2.2 Dependent Variables

Original coverage. To assess how much of the information, including input data, event sequences, and assertions, of the original test suite is leveraged (RQ1), we measure the state and transition coverage of the initial SFG (i.e., the SFG mined from the original test cases). We also measure how much of the unique assertions and unique input data in the original test cases has been utilized.

Fault detection rate. To answer RQ2 (assertion effectiveness), we evaluate the DOM-based fault detection capability of Testilizer through automated first-order mutation analysis. The test suites are evaluated based on the number of mutants detected by test assertions. We apply the DOM, jQuery, and XHR mutation operators at the JavaScript code level as described in [21], which are based on a study of common mistakes made by web developers. Examples include changing the ID/tag name used in getElementById and getElementsByTagName methods, changing the attribute name/value in setAttribute, getAttribute, and removeAttribute methods, removing the $ sign that returns a jQuery object, changing the name of the property/class/element in the addClass, removeClass, removeAttr, remove, attr, and css methods in jQuery, swapping innerHTML and innerText properties, and modifying the XHR type (Get/Post). On average we generate 36 mutant versions for each application.

Code coverage. Code coverage is commonly used as an indicator of the quality of a test suite by identifying under-tested parts, although it does not directly imply the effectiveness of a test suite [16]. Although Testilizer does not target code coverage maximization, to address RQ3 we compare the JavaScript code coverage of the different test suites using JSCover [3].
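To make the mutation step concrete, the hedged Java sketch below applies two of the listed operators, swapping innerHTML/innerText and flipping the XHR type, to a JavaScript snippet at the string level; it only illustrates the operators' effect and is not the mutation tooling from [21].

// Hedged sketch: string-level application of two of the DOM/XHR mutation
// operators described above, for illustration only.
public class SimpleJsMutator {

    static String swapInnerHtmlInnerText(String jsSource) {
        // Swap the two properties wherever they occur.
        return jsSource.replace("innerHTML", "__TMP__")
                       .replace("innerText", "innerHTML")
                       .replace("__TMP__", "innerText");
    }

    static String flipXhrType(String jsSource) {
        // Change the HTTP method used when opening an XMLHttpRequest.
        return jsSource.replace("open(\"GET\"", "open(\"POST\"");
    }

    public static void main(String[] args) {
        String original =
            "document.getElementById('mainContent').innerHTML = msg;\n" +
            "xhr.open(\"GET\", '/notes', true);";
        System.out.println(swapInnerHtmlInnerText(flipXhrType(original)));
    }
}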

5.3 Results
Original SFG Coverage (RQ1). Table 5 shows the average results of our experiments. As expected, the number of states, transitions, and generated test cases are higher in Testilizer. The random exploration (RAND) on average generates fewer states and transitions, but more test cases, compared to the original test suite.

Table 5: Results showing statistics of the test models and original test suite information usage, average over experimental objects.
Test Suite | # States | # Transitions | # Test Cases | Orig State Coverage | Orig Transition Coverage | Orig Input Data Usage | Orig Assertion Usage | JS Code Coverage
ORIG | 37 | 46 | 15 | 100% | 100% | 100% | 100% | 20%
EXND | 54 | 63 | 47 | 98% | 96% | 100% | 100% | 26%
RAND | 33 | 40 | 25 | 65% | 60% | 0% | 0% | 22%

This is mainly due to the fact that in the SFG generated by RAND, there are more paths from Index to the sink nodes than in the SFG mined from the original test suite. Regarding the usage of original test suite information (RQ1), as expected Testilizer, which leverages the event sequences and inputs of the original test suite, has almost full state (98%) and transition (96%) coverage of the initial model. The few cases missed are due to the traversal algorithm we used, which has limitations in dealing with cycles in the graph that do not end in a sink node and thus are not generated. Note that we can select the missing cases from the original manually-written test suite and add them to the generated test suite. By analyzing the generated test suites, we found that on average, Testilizer reused 22 input values (in addition to the login data) from the average of 15 original inputs. The RAND exploration approach covered about 60% of the states and transitions, without any usage of input data (apart from the login data, which was provided to RAND manually).

Figure 6: Comparison of average fault detection rate using different test suite generation methods.

Fault detection (RQ2). Figure 6 depicts a comparison of fault detection rates for the different methods. It shows that exact and similar generated assertions are more effective than original and reused ones. The effectiveness of each assertion generation technique on its own is not higher than that of the random approach. This is mainly because the number of random assertions per state is higher than the number of assertions reused/generated by Testilizer, since we always select five random assertions at each state from a pool of assertions but do not always find five exact/similar matches in a state. More importantly, the results show that Testilizer outperforms the fault detection capability of the original test suite by 150% (a 15% increase) and the random methods by 37% (a 7% increase). This supports our insight that leveraging input values and assertions from human-written test suites can be helpful in generating more effective test cases.

Code Coverage (RQ3). Although code coverage improvement is not the main goal of Testilizer in this work, the generated test suite has a slightly higher code coverage. As shown in Table 5, there is a 30% improvement (6% increase) over the original test suite and an 18% improvement (4% increase) over the RAND test suite. Note that the original test suites were already equipped with proper input data, but not many execution paths (thus the slight increase). On the other hand, the random exploration considered more paths in a blind search, but without proper input data.

Figure 5: Average number of assertions per state, before and after filtering unstable assertions.

Figure 5 presents the average number of assertions per state before and after filtering the unstable ones. The difference between the number of actually generated assertions and the stable ones reveals that our generated assertions (combined, similar/exact generated) are more stable than those of the random approach. The reduction percentage is 25%, 49%, 22%, 11%, 20%, 35%, and 45% for the original, reused, exact generated, similar generated, combined (Testilizer), EXND+RND, and RAND+RND assertions, respectively.

A major source of this instability is the selection of dynamic DOM elements in the generated assertions. For instance, RND (random assertion generation) selects many DOM elements with dynamic time-based attributes. Also, the more restricted an assertion is, the less likely it is to remain stable on different paths. This is the case for some of the (1) reused assertions that replicate the original assertions and (2) exact generated ones, especially the RegionFullMatched type. On the other hand, learned assertions are less strict (e.g., RegionTagAttMatched) and are thus more stable. Overall, the test suite generated by Testilizer, on average, consists of 12% original assertions, 11% reused assertions, 31% exact generated assertions, and 45% similar learned assertions.

5.4 Discussion
Test case dependencies. An assumption made in Testilizer is that the original test suite does not have any test case dependencies. Generally, test cases should be executable without any special order or dependency on previous tests. However, while conducting our evaluation, we came across multiple test suites that violated this principle. For such cases, although Testilizer can generate test cases, failures can occur due to these dependencies.

Effectiveness. The effectiveness of the generated test suite depends on multiple factors. First, the size and the quality of the original test suite are very important; if the original test suite does not contain paths with effective assertions, it is not possible to generate an effective extended test suite. In the future we plan to use other adequacy metrics, such as DOM coverage [22], to measure the quality of a given test suite. Second, the learning-based approach can be tuned in various ways (e.g., selecting other features, changing the SVM parameters, and choosing the sample dataset size) to obtain better results. Third, the size of the DOM subtree (region) to be checked can be increased to detect changes more effectively; however, this might come at the cost of making the test suite more brittle.

Efficiency. The larger a test suite, the more time it takes to test an application. Since in many testing environments time is limited, not all possible paths of events should be generated in the extended test suite. The challenge is finding a balance between effectiveness and efficiency of the test cases. The current graph traversal method in Testilizer may produce test cases that share common paths, which do not contribute much to fault detection or code coverage. An optimization could be realized by guiding the test generation algorithm towards states that have more constrained DOM-based assertions.

Threats to validity. Although Selenium is widely used in industry for testing commercial web applications, unfortunately, very few open source web applications are publicly available that have (working) Selenium test suites. Therefore, we were able to include only a limited number of applications in our study. A threat to the external validity of our experiment is with regard to the generalization of the results to other web applications. To mitigate this threat, however, we selected our experimental objects from different domains with variations in functionality and structure. With respect to reproducibility of our results, Testilizer, the test suites, and the experimental objects are publicly available, making the experiment reproducible.


6. RELATED WORK
Elbaum et al. [15] leverage user-sessions for web application test generation. Based on this work, Sprenkle et al. [30] propose a tool to generate additional test cases based on the captured user-session data. McAllister et al. [17] leverage user interactions for web testing. Their method relies on prerecorded traces of user interactions and requires instrumenting one specific web application framework. None of these techniques considers leveraging knowledge from existing test cases as Testilizer does. Xie and Notkin [34] infer a model of the application under test by executing the existing test cases. Dallmeier et al. [14] mine a specification of desktop systems by executing the test cases. Schur et al. [28] infer behaviour models from enterprise web applications via crawling. Their tool generates test cases simulating possible user inputs. Similarly, Xu et al. [35] mine executable specifications of web applications from Selenium test cases to create an abstraction of the system. Yuan and Memon [39] propose an approach to iteratively rerun automatically generated test cases for generating alternating test cases. This is inline with feedbackdirected testing [24], which leverages dynamic data produced by executing the program using previously generated test cases. For instance, Artemis [11] is a feedback-directed tool for automated testing of JavaScript applications that uses generic oracles such as HTML validation. Our previous work, FeedEx [20], applies a feedback-directed exploration

7. CONCLUSIONS AND FUTURE WORK
This work is motivated by the fact that a human-written test suite is a valuable source of domain knowledge, which can be used to tackle some of the challenges in automated web application test generation. Given a web application and its DOM-based test suite (e.g., written in Selenium), our tool, called Testilizer, utilizes the given test suite to generate effective test cases by exploring alternative paths of the application and regenerating assertions for newly detected states. Our empirical results on four real-world applications show that Testilizer clearly outperforms a random test generation technique and provides substantial improvements in the fault detection rate over the original test suite, while also slightly increasing code coverage. For future work, we plan to evaluate the effectiveness of other state-space exploration strategies (e.g., diversification of test paths) and to investigate correlations between the effectiveness of the original test suite and that of the generated test suite.

8. ACKNOWLEDGMENTS
This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through its Strategic Project Grants programme and an Alexander Graham Bell Canada Graduate Scholarship, and by the Swiss National Science Foundation (PBTIP2145663).

9. REFERENCES
[1] Claroline. http://www.claroline.net/.
[2] JavaParser. https://code.google.com/p/javaparser/.
[3] JSCover. http://tntim96.github.io/JSCover/.
[4] Organizer. http://www.apress.com/9781590596951.
[5] Phormer Photogallery. http://sourceforge.net/projects/rephormer/.
[6] Selenium HQ. http://seleniumhq.org/.
[7] Testilizer. http://salt.ece.ubc.ca/software/testilizer.
[8] WolfCMS. https://github.com/wolfcms/wolfcms.
[9] WSO2 EnterpriseStore. https://github.com/wso2/enterprise-store.
[10] N. Alshahwan and M. Harman. State aware test case regeneration for improving web application test suite coverage and fault detection. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pages 45–55, 2012.
[11] S. Artzi, J. Dolby, S. Jensen, A. Møller, and F. Tip. A framework for automated testing of JavaScript web applications. In Proceedings of the International Conference on Software Engineering (ICSE), pages 571–580. ACM, 2011.
[12] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[13] S. R. Choudhary, M. Prasad, and A. Orso. Crosscheck: Combining crawling and differencing to better detect cross-browser incompatibilities in web applications. In Proceedings of the International Conference on Software Testing, Verification and Validation (ICST), pages 171–180. IEEE Computer Society, 2012.
[14] V. Dallmeier, N. Knopp, C. Mallon, S. Hack, and A. Zeller. Generating test cases for specification mining. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pages 85–96, 2010.
[15] S. Elbaum, G. Rothermel, S. Karre, and M. Fisher. Leveraging user-session data to support web application testing. IEEE Transactions on Software Engineering, 31(3):187–202, 2005.
[16] L. Inozemtseva and R. Holmes. Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the International Conference on Software Engineering (ICSE), 2014.
[17] S. McAllister, E. Kirda, and C. Kruegel. Leveraging user interactions for in-depth testing of web applications. In Recent Advances in Intrusion Detection, volume 5230 of LNCS, pages 191–210. Springer, 2008.
[18] A. Mesbah, A. van Deursen, and S. Lenselink. Crawling Ajax-based web applications through dynamic analysis of user interface state changes. ACM Transactions on the Web (TWEB), 6(1):3:1–3:30, 2012.
[19] A. Mesbah, A. van Deursen, and D. Roest. Invariant-based automatic testing of modern web applications. IEEE Transactions on Software Engineering, 38(1):35–53, 2012.
[20] A. Milani Fard and A. Mesbah. Feedback-directed exploration of web applications to derive test models. In Proceedings of the International Symposium on Software Reliability Engineering (ISSRE), pages 278–287. IEEE Computer Society, 2013.
[21] S. Mirshokraie, A. Mesbah, and K. Pattabiraman. Efficient JavaScript mutation testing. In Proceedings of the International Conference on Software Testing, Verification and Validation (ICST). IEEE Computer Society, 2013.
[22] M. Mirzaaghaei and A. Mesbah. DOM-based test adequacy criteria for web applications. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pages 71–81. ACM, 2014.
[23] M. Mirzaaghaei, F. Pastore, and M. Pezze. Supporting test suite evolution through test case adaptation. In Proceedings of the International Conference on Software Testing, Verification and Validation (ICST), pages 231–240. IEEE Computer Society, 2012.
[24] C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedback-directed random test generation. In Proceedings of the International Conference on Software Engineering (ICSE), pages 75–84. IEEE Computer Society, 2007.
[25] K. Pattabiraman and B. Zorn. DoDOM: Leveraging DOM invariants for Web 2.0 application robustness testing. In Proceedings of the International Symposium on Software Reliability Engineering (ISSRE), pages 191–200. IEEE Computer Society, 2010.
[26] M. Pezze, K. Rubinov, and J. Wuttke. Generating effective integration test cases from unit ones. In Proceedings of the International Conference on Software Testing, Verification and Validation (ICST), pages 11–20. IEEE, 2013.
[27] K. Rubinov and J. Wuttke. Augmenting test suites automatically. In Proceedings of the International Conference on Software Engineering (ICSE), pages 1433–1434. IEEE Press, 2012.
[28] M. Schur, A. Roth, and A. Zeller. Mining behavior models from enterprise web applications. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), pages 422–432. ACM, 2013.
[29] R. Song, H. Liu, J.-R. Wen, and W.-Y. Ma. Learning important models for web page blocks based on layout and content analysis. ACM SIGKDD Explorations Newsletter, 6(2):14–23, 2004.
[30] S. Sprenkle, E. Gibson, S. Sampath, and L. Pollock. Automated replay and failure detection for web applications. In Proceedings of the ACM/IEEE International Conference on Automated Software Engineering (ASE), pages 253–262. ACM, 2005.
[31] S. Thummalapenta, K. V. Lakshmi, S. Sinha, N. Sinha, and S. Chandra. Guided test generation for web applications. In Proceedings of the International Conference on Software Engineering (ICSE), pages 162–171. IEEE Computer Society, 2013.
[32] V. Vapnik. The Nature of Statistical Learning Theory. Springer, 2000.
[33] Y. Wang, S. Person, S. Elbaum, and M. B. Dwyer. A framework to advise tests using tests. In Proceedings of the ICSE New Ideas and Emerging Results (NIER) track. ACM, 2014.
[34] T. Xie and D. Notkin. Mutually enhancing test generation and specification inference. In Formal Approaches to Software Testing, pages 60–69. Springer, 2004.
[35] D. Xu, W. Xu, B. K. Bavikati, and W. E. Wong. Mining executable specifications of web applications from Selenium IDE tests. In Proceedings of the IEEE International Conference on Software Security and Reliability (SERE), pages 263–272. IEEE, 2012.
[36] Z. Xu, Y. Kim, M. Kim, G. Rothermel, and M. B. Cohen. Directed test suite augmentation: techniques and tradeoffs. In Proceedings of the International Symposium on Foundations of Software Engineering (FSE), pages 257–266. ACM, 2010.
[37] S. Yoo and M. Harman. Test data regeneration: generating new test data from existing test data. Software Testing, Verification and Reliability, 22(3):171–201, 2012.
[38] X. Yuan and A. M. Memon. Using GUI run-time state as feedback to generate test cases. In Proceedings of the International Conference on Software Engineering (ICSE), pages 396–405. IEEE Computer Society, 2007.
[39] X. Yuan and A. M. Memon. Iterative execution-feedback model-directed GUI testing. Information and Software Technology, 52(5):559–575, 2010.
