The Benefits of Service Oriented Architecture for Financial Services Organizations

Legacy systems are prevalent in the banking and financial industry. Banking systems tend to skew older and are highly heterogeneous in nature, and this heterogeneity is compounded by the fact that replacing and integrating these systems is a difficult undertaking.

Mazursky (as cited in Baskerville, Cavallari, Hjort-Madsen, Pries-Heje, Sorrentino & Virili, 2010) states that older architectures complicate the integration of enterprise applications because their underlying elements create ‘closed’ architectures. Closed architectures restrict access to vital software and hardware configurations and force organizations to rely on single-vendor solutions for parts of their information technology. Thus, closed architectures hinder a bank’s ability to innovate and roll out new integrated financial products.

The flexibility of SOA facilitates easier connectivity to legacy backend systems from outside sources. Because of the size and complexity of most enterprise banking systems, reengineering these systems to flawlessly accommodate interaction with newer systems is not economically feasible. Inflexible systems can be difficult to modernize and maintain on an ongoing basis. Furthermore, banks’ legacy systems can suffer from a lack of interoperability, which can stifle innovation.

“According to a survey commissioned by Infosys and Ovum in May 2012, approximately three quarters of European banks are using outdated core legacy systems. What’s more, 80% of respondents see these outdated systems as barriers to bringing new products to market, whilst 75% say such systems hinder, rather than enable, process change. Integration cost is effectively a barrier to service provision flexibility and subsequent innovation” (Banking Industry Architecture Network, 2012).

The greater the number of banking systems that can easily interact, connect and share data with other systems and services, the more innovative banks can become with respect to their product offerings for consumers. Banks can respond to market demands more easily when appreciably lower system integration costs allow new product creation to flourish. New applications can also be developed in much shorter time frames to react to customer demand, as long as services are maintained and shared across the enterprise.

According to Earley & Free (2002), a bank that identifies a new business opportunity such as wealth management, but has never provided such a service in the past, can quickly ramp up functionality by utilizing services previously designed for other applications. The new wealth management application can be built more quickly and cost effectively because it does not duplicate existing functionality. Smaller and mid-tier banks can capitalize on bringing new applications and services to market quickly and economically. This responsiveness allows smaller banks to offer products similar to those offered by their larger competitors. The modularity of SOA design precludes the need for substantial re-engineering of smaller and mid-tier banks’ information technology. This is an important benefit because smaller banks do not have financial resources comparable to those of the more sizable industry players.

Additionally, SOA has the potential to free up development resources sooner, enabling those resources to engage in other development initiatives. “Few banks will be able to buy or build new capabilities quickly enough to satisfy market requirements without an SOA” (Earley & Free, 2002). In the survey conducted by Infosys Consulting and Ovum (a London-based research organization), 100% of responding banks said that SOA would become the “dominant, accepted future banking architecture” (Banking Industry Architecture Network, 2012).

Furthermore, banks can employ multiple services to complete a business process. One service that values a portfolio of financial assets at market rates (a mark-to-market calculation) could be coupled with another service that calculates the Value at Risk (VaR) of the bank’s entire portfolio. As with the wealth management example cited previously, these two services could be made available to many other enterprise-wide financial services and applications that require portfolio valuation and risk-related information. In this manner, the functionality is not inefficiently duplicated in every application that requires it.
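
To make the composition idea concrete, the sketch below is a minimal, hypothetical Python illustration (not taken from the cited sources): a mark-to-market service and a VaR service are exposed as independent functions, and a reporting consumer reuses both without duplicating their logic. The service names, data shapes and the simplified parametric VaR formula are all assumptions made for illustration.

```python
# Hypothetical sketch only: two independently reusable services (mark-to-market
# valuation and a simplified parametric VaR) composed by a reporting consumer.
from typing import Dict, List


def mark_to_market_service(positions: List[Dict], prices: Dict[str, float]) -> float:
    """Values a portfolio at current market prices (illustrative stub)."""
    return sum(p["quantity"] * prices[p["symbol"]] for p in positions)


def value_at_risk_service(portfolio_value: float, volatility: float,
                          z_score: float = 1.65) -> float:
    """Very rough parametric VaR at ~95% confidence (illustrative assumption)."""
    return portfolio_value * volatility * z_score


def portfolio_risk_report(positions: List[Dict], prices: Dict[str, float],
                          volatility: float) -> Dict[str, float]:
    """A consumer that reuses both services instead of re-implementing them."""
    value = mark_to_market_service(positions, prices)
    return {"market_value": value,
            "value_at_risk": value_at_risk_service(value, volatility)}


if __name__ == "__main__":
    positions = [{"symbol": "BOND_A", "quantity": 100},
                 {"symbol": "EQ_B", "quantity": 50}]
    prices = {"BOND_A": 98.5, "EQ_B": 210.0}
    print(portfolio_risk_report(positions, prices, volatility=0.02))
```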

Additionally, via the use of SOA, “New technology provides the ability to mix and match outsourced business processes with on-premise assets and services” (Essvale Corporation Limited, 2011). Software designed or used by third-party vendors can be more easily integrated with bank systems and processes due to the high connectivity of an SOA approach. According to Gartner research, smaller and mid-tier banks are adopting SOA in order to make the most of their limited IT budgets and resources. “Until now, midtier banks had to rely on customized software packages from a single vendor, and assumed all of the maintenance costs and function limitations inherent in a single, proprietary set of solutions” (Earley et al., 2005). Due to rising interest in SOA, technology vendors that serve the financial services industry are increasingly working with integration providers to offer a standard set of integration components (Earley et al., 2005).

One of the benefits of SOA standardization is that more functionality can be delivered by much less underlying code. This leads to less complex, more cost-effective system maintenance, thereby reducing operational risk.

“A fully implemented SOA provides a bank with a highly predictable application environment that reduces risk in day-to-day operations, due to the minimization and isolation of change to the production systems. Banks that fail to take this approach must constantly change their interfaces as external and internal requirements change. This introduces significant risk and the need for near-continuous testing to ensure that the customer ‘touchpoints’ and the back-end processes do not fail, while ensuring that one data or service change doesn’t adversely affect other data changes integrated through an interface” (Earley et al., 2005).

Conclusion

SOA has an important role to play in the architectural repertoire of banking and financial organizations. Its loosely coupled design allows services to be shared and reused across the enterprise without disparate systems concerning themselves with the underlying development code. Multiple services can be combined to form reusable chunks of a business process. Outside sources can connect to legacy backend systems via an API, which increases the opportunity to mix and match vendor capabilities with in-house assets. SOA also helps banks and financial firms ramp up new applications and functionality quickly and economically, increasing product responsiveness to market demands. When SOA is combined with Event-Driven Architecture, dynamic event-driven systems can be developed that do not rely solely on the less proactive request/reply paradigm.

Banks and financial companies need to remain innovative and cost effective, and to anticipate customer needs, in order to remain profitable. SOA allows organizations to become more agile and flexible with their application development. The rise of applications on mobile, cloud-enabled platforms means that customers will need to connect to data wherever it dwells. “Bank delivery is focused on reactively providing products on customer request, and mass-market, one-size-fits-all products (for mainstream retail banking). However, it is no longer feasible to force-fit mass-market bank products when technology devices, context and location are key elements to the type of customized bank service a customer needs” (Moyer, 2012). As SOA continues to mature with cloud-enabled solutions and the rise of mobile computing, it is primed to be the building block for the next generation of banking application functionality.

References:

Baskerville, R., Cavallari, M., Hjort-Madsen, K., Pries-Heje, J., Sorrentino, M., & Virili, F. (2010). The strategic value of SOA: a comparative case study in the banking sector. International Journal of Information Technology and Management, Vol. 9, No. 1, 2010.

Banking Industry Architecture Network. (2012). SOA, standards and IT systems: how will SOA impact the future of banking services? Available from https://bian.org/wp-content/uploads/2012/10/BIAN-SOA-report-2012.pdf

Earley, A., & Free, D. (2002, September 4). SOA: A ‘Must Have’ for Core Banking (ID: SPA-17-9683). Retrieved from Gartner database.

Earley, A., Free, D., & Kun, M. (2005, July 1). An SOA Approach Will Boost a Bank’s Competitiveness (ID: G00126447). Retrieved from Gartner database.

Essvale Corporation Limited. (2011). Business knowledge for IT in global retail banking: a complete handbook for IT professionals.

Overview of Service Oriented Architecture

 

Service Oriented Architecture (SOA) can be described as an architectural style or strategy of “building loosely coupled distributed systems that deliver application functionality in the form of services for end-user applications” (Ho, 2003). Ho (2003) describes a service as a simple unit of work offered by a service provider; the service produces a desired end result for the service consumer. Another way to envision the concept of a service is to imagine a “reusable chunk of a business process that can be mixed and matched with other services” (Allen, 2006). The services either communicate with each other (i.e., pass data back and forth) or work in unison to enable or coordinate an activity.

When SOA is employed for designing applications and/or IT systems, the component services can be reused across the enterprise, which helps to lower overall development costs, among other benefits. Reuse fosters consistency across the enterprise. For example, SOA enables banks to meet the needs of small but profitable market segments without the need to redevelop new intelligence for a broad set of applications (Earley, Free & Kun, 2005). Furthermore, any number of services can be combined to mimic a business process.

“One of the most important advantages of a SOA is the ability to get away from an isolationist practice in software development, where each department builds its own system without any knowledge of what has already been done by others in the organization. This ‘silo’ approach leads to inefficient and costly situations where the same functionality is developed, deployed and maintained multiple times” (Maréchaux, 2006).

Architectural Model

Services are accessed only through a published application programming interface (API). The API, which acts as the representative of the service to other applications, services or objects, is “loosely coupled” with the underlying development and execution code. An outside client invoking the service is not concerned with the service’s development code, which remains hidden from the client. “This abstraction of service implementation details through interfaces insulates clients from having to change their own code whenever any changes occur in the service implementation” (Khanna, 2008). In this manner, the service acts as a “black box” whose inner workings and design are completely independent from requestors. If the underlying code of the service were switched from Java to C++, would-be requestors of the service would be completely oblivious to the change.
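
As a minimal sketch of this “black box” abstraction (my own illustration, not drawn from Khanna or the other cited sources), the hypothetical Python example below defines a published interface and two interchangeable implementations; the client codes only against the interface, so swapping the implementation requires no change to client code. All class and method names are invented for the example.

```python
# Hypothetical sketch of the "black box" idea: clients depend only on a
# published interface, so the implementation can change without client changes.
from abc import ABC, abstractmethod


class AccountBalanceService(ABC):
    """The published interface (the 'API' that consumers see)."""

    @abstractmethod
    def get_balance(self, account_id: str) -> float:
        ...


class LegacyCoreBankingAdapter(AccountBalanceService):
    """One implementation, standing in for a wrapper around a legacy backend."""

    def get_balance(self, account_id: str) -> float:
        return 1250.75  # stubbed value in place of a real mainframe call


class ModernLedgerService(AccountBalanceService):
    """A replacement implementation; requestors never notice the swap."""

    def get_balance(self, account_id: str) -> float:
        return 1250.75  # stubbed value in place of a call to a newer platform


def print_statement_header(service: AccountBalanceService, account_id: str) -> None:
    # The client codes against the interface only and never sees the implementation.
    print(f"Account {account_id} balance: {service.get_balance(account_id):.2f}")


if __name__ == "__main__":
    print_statement_header(LegacyCoreBankingAdapter(), "ACCT-001")
    print_statement_header(ModernLedgerService(), "ACCT-001")  # no client change required
```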

Allen (2006) describes the concept of loose coupling as “a feature of software systems that allows those systems to be linked without having knowledge of the technologies used by one another.” Loosely coupled software can be configured and combined with other software at runtime. Tightly coupled software does not offer the same integration flexibility, as its configuration is determined at design time. This design-time configuration significantly hinders reusability. In addition, loosely coupled applications are much more adaptable to unforeseen changes that may occur in business environments.

In the early 1990s some financial firms adopted an object-oriented approach to their banking architecture. This approach is only superficially similar to a service oriented architecture approach. In an object-oriented (OO) approach, the emphasis is on the ability to reuse objects within the source code. SOA emphasizes a “runtime reuse” philosophy in which the service itself is discoverable and reused across a network (Earley, Free & Kun, 2005). SOA also provides a solution to the lack of interoperability between legacy systems.

References:

Allen, P. (2006). Service orientation: winning strategies and best practices.

Earley, A., Free, D., & Kun, M. (2005, July 1). An SOA Approach Will Boost a Bank’s Competitiveness (ID: G00126447). Retrieved from Gartner database.

Ho, H. (2003). What is Service-Oriented Architecture? O’Reilly XML.com.

Khanna, A. (2008). Straight through processing for financial services: the complete guide.

Maréchaux, J. (2006, March 28). Combining Service-Oriented Architecture and Event-Driven Architecture using an Enterprise Service Bus. IBM developerWorks. Retrieved from http://www.ibm.com/developerworks/library/ws-soa-eda-esb/

Spear Phishing

Regarding this New York Times article: Hackers in China Attacked The Times for Last 4 Months

Spear phishing attacks against businesses, diplomatic organizations and government agencies seem to be very popular with cyber espionage networks. It only takes one person taking the wrong action for the entire system to be compromised, as the New York Times is discovering.

In 2012, China used spear phishing with a .pdf file that exploited a Windows vulnerability to attack Tibetan activist groups. Antivirus software did not widely recognize the threats, as was the case with the NYT’s imbroglio. [1]

In a similar vein to the attacks on the NYT, targeted spear phishing was used in a very recent incident called Operation Red October (a nod to the fact that the attacks emanated from a Russophone country). The malware produced in this attack is called ‘Rocra’, and it is aimed at governments and research institutions in former Soviet republics and Eastern Europe.

The New York Times article states “Once they take a liking to a victim, they tend to come back. It’s not like a digital crime case where the intruders steal stuff and then they’re gone. This requires an internal vigilance model.”

It’s intriguing that the Red October attacks embody the spirit of that quote in the design of their malware:

“Red October also has a “resurrection” module embedded as a plug-in in Adobe Reader and Microsoft Office applications. This module made it possible for attackers to regain control of a system even after the malware itself was discovered and removed from the system.”

This is pretty scary stuff, but ingenious nonetheless. Organizations need to take heed and make sure they are doing absolutely everything they can to combat attacks and to train users about the dangers of spear phishing.

[1] http://www.scmagazineuk.com/chinese-spears-attack-tibetan-activists/article/231923/

[2] http://www.wnd.com/2013/01/red-october-cyberattack-implodes/

 

What Exactly is Hadoop & MapReduce?

In a nutshell, Hadoop is an open source framework that enables the distributed processing of large amounts of data across multiple servers. In effect, it provides a distributed file system tailored to the storage needs of big data analysis. Instead of holding all of the required data on one big, expensive machine, Hadoop offers a scalable solution that incorporates more drives and data sources as the need arises.

Having the storage capacity for big data analyses in place is essential, but equally important is having the means to process data from the distributed sources. This is where MapReduce comes into play.

MapReduce is a programming model introduced by Google for processing and generating large data sets on clusters of computers. This video from IBM Analytics does an excellent job of presenting a clear, concise description of what MapReduce accomplishes.

https://www.youtube.com/watch?v=8wjvMyc01QY
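
For readers who prefer code to video, the following is a minimal, pure-Python sketch of the MapReduce idea applied to word counting. It illustrates the map, shuffle and reduce phases conceptually; it is an assumption-laden toy example and does not use the actual Hadoop APIs.

```python
# Conceptual, pure-Python sketch of the MapReduce model (word count).
# This illustrates the idea only; it does not use the Hadoop APIs.
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit (key, value) pairs -- here, (word, 1) for every word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the grouped values for each key into a final result."""
    return {key: sum(values) for key, values in grouped.items()}


if __name__ == "__main__":
    documents = ["big data needs big storage", "map reduce processes big data"]
    mapped = list(chain.from_iterable(map_phase(d) for d in documents))
    print(reduce_phase(shuffle_phase(mapped)))  # e.g. {'big': 3, 'data': 2, ...}
```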

Enterprise Architecture and Best Practices

Enterprise Architecture (EA) is the overarching framework that relies on a combination of artifacts in each domain area to paint a picture of the organization’s current (and future) capabilities. Best practices are proven methodologies for implementing portions of the architecture.

“Best Practices” and “Artifacts” are two of the six core elements that are necessary for an EA approach to be considered complete. A best practice can be used as an artifact within the EA, leading to complementary effects. For example, best practices such as SWOT Analysis and the Balanced Scorecard can be used as artifacts within EA at the strategic level. The Knowledge Management Plan, which outlines how data and information are shared across the enterprise, is an artifact that resides in the Data & Information functional area of the EA3 Cube framework.

As EA is the integration of strategy, business and technology, organizational strategic dictates can greatly influence the use or adoption of a best practice. If rapid application development, reusability or speed to market is important to the business, then the best practice of Service-Oriented Architecture (SOA) can be used within the EA to meet that end.

This case from T-Mobile highlights SOA within an EA framework enabling quick development and reusability:

“For example, one T-Mobile project was to create a service that would let subscribers customize their ring sounds. The project group assumed that it would have to create most of the software from scratch. But the architecture group found code that had already been written elsewhere at T-Mobile that could be used as the basis of the development effort. The reuse reduced the development cycle time by a factor of two, says Moiin [vice president of technical strategy for T-Mobile International]” [2].

Another example of EA driving a best practice selection is evident at Verizon. The enterprise architects at Verizon have incorporated an SOA best practice into their EA framework with the aim of building a large repository of web services that will help cut development times.

“‘We had to give incentives to developers to develop the Web services,’ says Kheradpir. ‘When they say they want to develop a certain functionality, we say: ‘You can do that, but you have to build a Web service also.’’

Though the developers grumbled at first, they eventually saw that everybody wins with a rapidly expanding repository of services. ‘This is going to be our strategy for enterprise architecture,’ says Kheradpir. ‘Now developers can go to the Web site and grab a service and start coding the higher-level application they need, rather than spending so much time at the integration and infrastructure level’”[2].

[1] Bernard, Scott A. “Linking Strategy, Business and Technology. EA3 An Introduction to Enterprise Architecture.” Bloomington, IN: Author House, 2012, Third Edition

[2] Koch, Christopher. “A New Blueprint for the Enterprise”, April 8th 2005. Retrieved from web: http://www.cio.com.au/article/30960/new_blueprint_enterprise/?pp=4

Normalization: A Database Best Practice

The practice of normalization is widely regarded as the standard methodology for logically organizing data to reduce anomalies in database management systems. Normalization involves deconstructing information into various sub-parts that are linked together in a logical way. Malaika and Nicola (2011) state, “…data normalization represents business records in computers by deconstructing the record into many parts, sometimes hundreds of parts, and reconstructing them again as necessary. Artificial keys and associated indexes are required to link the parts of a single record together.” Although there are successively more stringent forms of normalization, best practice involves decomposing information into third normal form (3NF). Higher normal forms provide protection from anomalies that most practitioners will rarely encounter.

Background

The normalization methodology was the brainchild of mathematician and IBM researcher Dr. Edgar Frank Codd. Dr. Codd developed the technique while working at IBM’s San Jose Research Laboratory in 1970 (IBM Archives, 2003). Early databases employed either inflexible hierarchical designs or a collection of pointers to data on magnetic tapes. “While such databases could be efficient in handling the specific data and queries they were designed for, they were absolutely inflexible. New types of queries required complex reprogramming, and adding new types of data forced a total redesign of the database itself” (IBM Archives, 2003). In addition, disk space in the early days of computing was limited and highly expensive. Dr. Codd’s seminal paper “A Relational Model of Data for Large Shared Data Banks” proposed a flexible structure of rows and columns that would help reduce the amount of disk space necessary to store information. Furthermore, this revolutionary new methodology provided the benefit of significantly reducing data anomalies. These benefits are achieved by ensuring that data is stored on disk exactly once.

Normal Forms (1NF to 3NF):

Normalization is widely regarded as the best practice when developing a coherent, flexible database structure. Adams & Beckett (1997) state that designing a normalized database structure should be the first step taken when building a database that is meant to last. There are seven different normal forms; each higher form includes the requirements of the forms below it. Thus a database in second normal form (2NF) is also in first normal form (1NF), but satisfies additional conditions. Normalization best practice holds that databases in third normal form (3NF) should suffice for the widest range of solutions. Adams & Beckett (1997) called 3NF “adequate for most practical needs.” When Dr. Codd initially proposed the concept of normalization, 3NF was the highest form introduced (Oppel, 2011).

A database table has achieved 1NF if it does not contain any repeating groups and its attributes cannot be decomposed into smaller portions (atomicity). Most importantly, all of the data must relate to a primary key that uniquely identifies each row. “When you have more than one field storing the same kind of information in a single table, you have a repeating group” (Adams & Beckett, 1997). A higher level of normalization is often needed for tables in 1NF, as such tables are often subject to “data duplication, update performance degradation, and update integrity problems…” (Teorey, Lightstone, Nadeau & Jagadish, 2011).
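
A minimal sketch of a 1NF violation and its repair, using hypothetical table and column names (shown here as plain Python data structures rather than SQL):

```python
# Hypothetical example: a record with a repeating group, and its 1NF equivalent.

# Violates 1NF: Phone1/Phone2 store the same kind of fact in repeating fields.
customer_unnormalized = {
    "CustomerID": 101, "Name": "A. Smith",
    "Phone1": "555-0100", "Phone2": "555-0199",
}

# 1NF: the repeating group moves to its own table, keyed back to the customer.
customers = [
    {"CustomerID": 101, "Name": "A. Smith"},       # primary key: CustomerID
]
customer_phones = [
    {"CustomerID": 101, "Phone": "555-0100"},      # primary key: (CustomerID, Phone)
    {"CustomerID": 101, "Phone": "555-0199"},
]
```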

A database table has achieved 2NF if it meets the conditions of 1NF and all of the non-key fields depend on ALL of the key fields (Stephens, 2009). It is important to note that a table satisfying 1NF whose primary key consists of a single column is automatically in 2NF. In essence, 2NF helps data modelers determine whether two tables have been improperly combined into one.
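
A minimal sketch of a 2NF violation, again with hypothetical names: the composite key is (OrderID, ProductID), and ProductName depends on only part of that key, so it moves to its own table.

```python
# Hypothetical example: composite key (OrderID, ProductID) with a partial dependency.

# Violates 2NF: ProductName depends on ProductID alone, not on the whole key.
order_lines_unnormalized = [
    {"OrderID": 1, "ProductID": "P1", "ProductName": "Checking Pkg", "Quantity": 2},
]

# 2NF: product facts move to a table keyed by ProductID alone.
order_lines = [{"OrderID": 1, "ProductID": "P1", "Quantity": 2}]
products = [{"ProductID": "P1", "ProductName": "Checking Pkg"}]
```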

A database table has achieved 3NF if it meets the conditions of 2NF and contains no transitive dependencies. “A transitive dependency is when one non‐key field’s value depends on another non‐key field’s value” (Stephens, 2009). If any field in the table depends on another non-key field, then the dependent field should be placed into another table.

If, for example, field B is functionally dependent on field A (A → B), then move fields A and B to a new table, with field A designated as the key that provides linkage back to the original table.
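
A minimal sketch of this decomposition rule with hypothetical names, where BranchCity (field B) depends transitively on BranchID (field A) rather than on the AccountID key:

```python
# Hypothetical example of the rule above: AccountID -> BranchID -> BranchCity.

# Violates 3NF: BranchCity depends on the non-key field BranchID, not on AccountID.
accounts_unnormalized = [
    {"AccountID": 9001, "BranchID": "BR7", "BranchCity": "Boston"},
]

# 3NF: field A (BranchID) and field B (BranchCity) form a new table; BranchID
# stays in Accounts as the link back to the original table.
accounts = [{"AccountID": 9001, "BranchID": "BR7"}]
branches = [{"BranchID": "BR7", "BranchCity": "Boston"}]
```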

In short, 2NF and 3NF help determine the relationship between key and non-key attributes. Kent (1983) states, “Under second and third normal forms, a non-key field must provide a fact about the key, just the whole key, and nothing but the key.” A variant of this definition is typically supplemented with the remark “so help me Codd.”

Benefits:

Adams & Beckett (1997) assert that the normalization method provides benefits such as database efficiency and flexibility, the avoidance of redundant fields, and an easier-to-maintain database structure. Hoberman (2009) adds that normalization provides a modeler with a stronger understanding of the business. The normalization process ensures that many questions are asked about data elements so that those elements may be assigned to entities correctly. Hoberman also agrees that data quality is improved as redundancy is reduced.

Assessment

Although normalization is considered best practice, many sources advocate that normalization to 3NF is sufficient for the majority of data modeling solutions. Third normal form is deemed sufficient because the anomalies covered in higher forms occur much less frequently. Adams & Beckett (1997) describe 3NF as “adequate for most practical needs.” Stephens (2009) affirms that many designers view 3NF as the form that combines adequate protection from recurrent data anomalies with relative modeling ease. Levels of normalization beyond 3NF can yield data models that are over-engineered, overly complicated and hard to maintain. The risk inherent in higher-form constructs is that performance can degrade to a level worse than that of less normalized designs. Hoberman (2009) asserts that, “Even though there are higher levels of normalization than 3NF, many interpret the term ‘normalized’ to mean 3NF.”

There are examples in the data modeling literature where strict adherence to normalization is not advised. Fotache (2006) posits that normalization was initially a highly rigorous, theoretical methodology of little practical use to real-world development. Fotache provides the example of an attribute named ADDRESS, which is typically stored by many companies as an atomic string per 1NF requirements. The ADDRESS data could be stored in one field (violating 1NF) if the data is only needed for better person identification or mailing purposes. Teorey, Lightstone, Nadeau, & Jagadish (2011) advise that denormalization should be considered when performance considerations are in play. Denormalization introduces a trade-off of increased update cost versus lower read cost, depending upon the level of data redundancy. Date (1990) downplays strict adherence to normalization and sets a minimum requirement of 1NF. “Normalization theory is a useful aid in the process, but it is not a panacea; anyone designing a database is certainly advised to be familiar with the basic techniques of normalization…but we do not mean to suggest that the design should necessarily be based on normalization principles alone” (Date, 1990).

Conclusion

Normalization is the best practice when designing a flexible and efficient database structure. The first three normal forms can be remembered by recalling a simple mnemonic. All attributes should depend upon a key (1NF), the whole key (2NF) and nothing but the key (3NF).

The advantages of normalization are many. Normalization ensures that modelers have a strong understanding of the business, it greatly reduces data redundancies and it improves data quality. When there is less data to store on disk, updating and inserting becomes a faster process. In addition, insert, delete and update anomalies disappear when adhering to normalization techniques. “The mantra of the skilled database designer is, for each attribute, capture it once, store it once, and use that one copy everywhere” (Stephens, 2009).

It is important to remember that normalization to 3NF is sufficient for the majority of data modeling solutions. Higher levels of normalization can overcomplicate a database design and have the potential to provide worse performance.

In conclusion, begin the database design process by using normalization techniques. For implementation purposes, normalize data to 3NF and then consider whether data retrieval performance necessitates denormalizing to a lower form, accepting the trade-off of increased update cost in exchange for lower read cost.

From the DAMA Guide to the Data Management Body of Knowledge:

  • First normal form (1NF): Ensures each entity has a valid primary key, that every data element depends on the primary key, that repeating groups are removed, and that each data element is atomic (not multi-valued).
  • Second normal form (2NF): Ensures each entity has the minimal primary key and that every data element depends on the complete primary key.
  • Third normal form (3NF): Ensures each entity has no hidden primary keys and that each data element depends on no data element outside the key (“the key, the whole key and nothing but the key”).

Glossary:

Delete Anomaly: “A delete anomaly is a situation where a deletion of data about one particular entity causes unintended loss of data that characterizes another entity.” (Stephens, 2009)

Denormalization: Denormalization involves reversing the process of normalization to gain faster read performance.

Insert Anomaly: “An insert anomaly is a situation where you cannot insert a new tuple into a relation because of an artificial dependency on another relation.” (Stephens, 2009)

Normalization: “Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.” (Microsoft Knowledge Base)

Primary Key: “Even though an entity may contain more than one candidate key, we can only select one candidate key to be the primary key for an entity. A primary key is a candidate key that has been chosen to be the unique identifier for an entity.” (Hoberman, 2009)

Update Anomaly: “An update anomaly is a situation where an update of a single data value requires multiple tuples (rows) of data to be updated.” (Stephens, 2009)

Bibliography

Adams, D., & Beckett, D. (1997). Normalization Is a Nice Theory. Foresight Technology Inc. Retrieved from http://www.4dcompanion.com/downloads/papers/normalization.pdf

DAMA International (2009). The DAMA Guide to the Data Management Body of Knowledge (1st Edition).

Fotache, M. (2006, May 1) Why Normalization Failed to Become the Ultimate Guide for Database Designers? Available at SSRN: http://ssrn.com/abstract=905060 or http://dx.doi.org/10.2139/ssrn.905060

Hoberman, S. (2009). Data modeling made simple: a practical guide for business and IT professionals, second edition. [Books24x7 version] Available from http://common.books24x7.com.libezproxy2.syr.edu/toc.aspx?bookid=34408.

IBM Archives (2003): Edgar F. Codd. Retrieved from http://www-03.ibm.com/ibm/history/exhibits/builders/builders_codd.html

Kent, W. (1983) A Simple Guide to Five Normal Forms in Relational Database Theory. Communications of the ACM 26(2). Retrieved from http://www.bkent.net/Doc/simple5.htm

Malaika, S., & Nicola, M. (2011, December 15). Data normalization reconsidered, Part 1: The history of business records. Retrieved from http://www.ibm.com/developerworks/data/library/techarticle/dm-1112normalization/

Microsoft Knowledge Base. Article ID: 283878. Description of the database normalization basics. Retrieved from http://support.microsoft.com/kb/283878

Oppel, A. (2011). Databases demystified, 2nd edition. [Books24x7 version] Available from http://common.books24x7.com.libezproxy2.syr.edu/toc.aspx?bookid=72521.

Stephens, Rod. (2009). Beginning database design solutions. [Books24x7 version] Available from http://common.books24x7.com.libezproxy2.syr.edu/toc.aspx?bookid=29584.

Teorey, T., Lightstone, S., Nadeau, T., & Jagadish, H. V. (2011). Database modeling and design: logical design, fifth edition. [Books24x7 version] Available from http://common.books24x7.com.libezproxy2.syr.edu/toc.aspx?bookid=41847.


Interpret Big Data with Caution

One caution with respect to employing big data (or any other data-reliant technique) is the tendency of practitioners to be overconfident in understanding the inputs and interpreting the outputs. It sounds like a fundamental concept, but if one does not have a strong understanding of what the incoming data signifies, then the interpreted output is highly likely to be biased. As with sampling, if the sample is not representative of the larger whole, bias will occur. For example:

“Consider Boston’s Street Bump smartphone app, which uses a phone’s accelerometer to detect potholes without the need for city workers to patrol the streets. As citizens of Boston download the app and drive around, their phones automatically notify City Hall of the need to repair the road surface.” [1]

One would be tempted to conclude that the data feeding into the app reasonably represents all of the potholes in the city. In actuality, the data fed into the app represented potholes in areas inhabited by young, affluent smartphone owners. The city runs the risk of neglecting areas where older, less affluent residents without smartphones experience potholes, and those areas make up a significant portion of the city.

“As we move into an era in which personal devices are seen as proxies for public needs, we run the risk that already existing inequities will be further entrenched. Thus, with every big data set, we need to ask which people are excluded. Which places are less visible? What happens if you live in the shadow of big data sets?” [2]

[1] http://www.ft.com/intl/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabdc0.html

[2] https://hbr.org/2013/04/the-hidden-biases-in-big-data/

The Importance of Standards in Large Complex Organizations

Large, complex organizations require standards for developing strategic goals, business processes, and technology solutions because agreed-upon guiding principles support organizational efficiencies. Without standards in these spaces, there is an increased potential for duplication of functionality as localized business units implement processes and technologies with disregard for the enterprise as a whole. When applications, tools and vendors are considered at the enterprise level, standards help ensure seamless integration.

Examples of enterprise standards might be:

  • “The acquisition of purchased applications is preferred to the development of custom applications.” [1]
  • “The technology architecture must strive to reduce application integration complexity” (an infrastructure-driven principle) [1]
  • “Open industry standards are preferred to proprietary standards.” [1]

These hypothetical top down standards can help settle the “battle of best practices” [2]. Standards also provide direction and can guide the line of business staff’s decision-making so that the entire organization is aligned to strategic goals. Furthermore, the minimization of diversity in technology solutions typically lowers complexity, which in turn helps to lower associated costs.

Enterprise Architecture standards also have a place in facilitating “post merger/acquisition streamlining and new product/service rollouts” [2]. Successfully and rapidly integrating new acquisitions onto a common framework can be vital to success.

Here are two banking examples where post merger system integration problems arose in financial services companies:

  • In 1996, when Wells Fargo & Co. bought First Interstate Bancorp, thousands of customers left because of missing records, long lines at branches, and administrative snarls. In 1997, Wells Fargo announced a $150 million write-off to cover lost deposits due to its faulty IT integration. [3]
  • In 1998, First Union Corp. and CoreStates Financial Corp. merged to form one of the largest U.S. banks. In 1999, First Union saw its stock price tumble on news of lower-than-expected earnings resulting from customer attrition. The problems arose from First Union’s ill-fated attempt at rapidly moving customers to a new branch-banking system. [3]

Having robust Enterprise Architecture standards in place may have helped to reduce the risk of failure when integrating these dissimilar entities.

References:

[1] Fournier, R., “Build for Business Innovation – Flexible, Standardized Enterprise Architectures Will Produce Several IT Benefits.” Information Week, November 1, 1999. Retrieved from Factiva database.

[2] Bernard, Scott A. “Linking Strategy, Business and Technology. EA3 An Introduction to Enterprise Architecture.” Bloomington, IN: Author House, 2012, Third Edition

[3] Popovich, Steven G., “Meeting the Pressures to Accelerate IT Integration”, Mergers & Acquisitions: The Dealmakers Journal, December 1, 2001. Retrieved from Factiva database.