Project 1

Milestone 1

The application I wish to design and implement is a voice transcription app. The inspiration behind this app is that I struggle on a daily basis to have conversations in person because I am profoundly hard of hearing. I constantly need to ask, “What?” or “Can you repeat that?” to the point of annoying people. I would like an app that, at the push of a button, transcribes what a person is saying in real time so that a hard-of-hearing or deaf user (the audience for this app) can keep up with the conversation. The goal is for the user to ask the person they would otherwise ask to repeat themselves to speak toward the microphone on the user's mobile device; with the push of a button, the device transcribes what that person is saying.

There are a few apps that already exist that are similar to what I desire to create.

Voicea - Voicea is a transcription application that I have used in classes before. It is a fairly good application, but it requires a lot of in-app purchases to get access to real-time transcribing. It is fairly efficient but very costly.

Shazam - Shazam uses similar technology to quickly pick up the music in its surroundings and match it to the song the user wants to identify. It accesses the microphone in a similar way and listens until the song is discovered. I like having a very simple main screen with one button for listening, which the user presses and holds until the song is discovered. It is simple, yet intuitive.

Rev - An application where you can upload Voice Memos and audio files from your phone, such as recordings of meetings, and receive a completed transcription within 24 hours. This is not real-time transcription, and that delay is what I want to eliminate with my application. Additionally, it charges per minute of transcription, which is not desirable for most people who want transcriptions that are easy, fast, and financially accessible.

Steno - An application that provides live transcription and does it well. The color scheme is nice, but the screen becomes very cluttered once the text shows up on the page. Additionally, like many other apps, it is expensive to pay for a set number of transcription minutes, which is a downside shared by most of these apps. Some of the reviews stated that it was not intuitive, which makes it difficult to use in situations where you want to capture the audio quickly.

There are many other apps that are rated poorly for transcribing, or that only handle video or audio uploads rather than live transcription. This makes it evident that developing this app is doable: not easy, but definitely possible. In order to develop it, I will need to use a number of different resources to understand the more difficult functionality that has not been discussed yet. The Apple Developer documentation has a lot of information on the transcription tools and on accessing the microphone.

An example of one of the types I will be using to develop this app is SFSpeechRecognitionResult, which carries the transcription that is produced and returned to the user. Overall, the Apple Developer documentation under the Speech framework has countless resources for building a successful transcribing app, and I will rely on it primarily throughout the development process. A rough sketch of how I expect to use it is shown below.
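To get a feel for how this might look in code, here is a rough sketch, based on my reading of the Speech framework documentation, of how live microphone audio could be fed into a recognition request and how each SFSpeechRecognitionResult carries the text to show the user. The class name TranscriptionEngine, the locale, and the callback are placeholders of my own, not code from Apple's documentation.

```swift
import Speech
import AVFoundation

// A minimal sketch (not final code) of live transcription with the Speech framework.
final class TranscriptionEngine {
    private let audioEngine = AVAudioEngine()
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func start(onUpdate: @escaping (String) -> Void) throws {
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true   // stream text as the person talks
        self.request = request

        // Feed microphone audio into the recognition request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        // Each callback hands back an SFSpeechRecognitionResult whose
        // bestTranscription is the text to display on screen.
        task = recognizer?.recognitionTask(with: request) { [weak self] result, error in
            if let result = result {
                onUpdate(result.bestTranscription.formattedString)
            }
            if error != nil || result?.isFinal == true {
                self?.stop()
            }
        }
    }

    func stop() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        task = nil
        request = nil
    }
}
```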

The prototype of the user interaction I hope to develop is displayed below on the note cards. The user will be presented with a home screen similar to Shazam's, with one button in the center that acts like the record button in the iOS Camera app. The transcription will then appear below the button until the user presses the button again to stop recording. At the end of the transcription, the user has the option to edit the text and then save or delete the transcription. I want this app to be intuitive.

I will use wayfinding, through the logos and icons, to let users know where they are on each page, and I will always provide a way back or a way to delete the transcription as their exit route. Feedback will be provided by the text appearing in the transcription. I will use the share icon and the trash icon, which is consistent with what users of iOS devices are already familiar with. I will apply the 80/20 rule and progressive disclosure to keep the app as intuitive as possible and avoid cluttering the interface with unnecessary and confusing information. Additionally, the circle on the home screen will give the user a good sense of symmetry.


Milestone 2

Before approaching the development phase of the app, we had a feedback session in class and also received feedback from the grading of our initial documentation. In that feedback, many people said they liked the idea of the sharrow and trash buttons; however, it is common for those buttons to be shown at the bottom of the view. Placing them there follows the consistency principle, so users intuitively know what those buttons are meant to do.

The documentation for the Speech API is thorough but still complicated to understand. I was able to find many examples that use the Speech framework and to piece together what is absolutely necessary according to the Apple Developer documentation versus what each individual developer added on top. However, as I started to piece together what I thought was necessary, I found out the hard way that I was very much wrong. I had to reanalyze every line of code to understand how the Speech API works, and I slowly started to get a better idea. I found a developer's code on GitHub that was a very basic implementation of the Apple Developer description and worked on manipulating that.

One thing I was not aware of is that if you use the Speech API, you are required to request the user's permission to access the microphone and perform speech recognition. This is a legal requirement, and it also ensures the user knows what is happening and why the microphone is necessary for the application to function. I understand why this is necessary; I just hadn't thought about the fact that I would need to code it myself. A rough sketch of those permission requests appears after the screenshot below.

I also tried another variation of an app that took spoken words for flight tracking and found the flight associated with what the user said. That app used an additional controller Swift file that confused me and didn't make sense for my purposes, so I decided to drop it from my implementation altogether. The app that worked best, and that was simplest for me to understand and adapt, was the Apple Developer sample app. The finished version of the Apple Developer app looked like the image below:

 
[Screenshot: appledev.png — the completed Apple Developer sample app]
 
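Here is a rough sketch of what I understand those permission requests to look like. The function name is a placeholder of my own, and the app's Info.plist also needs the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription entries explaining to the user why access is needed.

```swift
import Speech
import AVFoundation

// A sketch of the permission requests required before transcription can start.
func requestTranscriptionPermissions() {
    // Ask for permission to send audio to the speech recognizer.
    SFSpeechRecognizer.requestAuthorization { authStatus in
        DispatchQueue.main.async {
            switch authStatus {
            case .authorized:
                print("Speech recognition authorized")
            case .denied, .restricted, .notDetermined:
                print("Speech recognition not authorized")
            @unknown default:
                break
            }
        }
    }

    // Ask for permission to record from the microphone.
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        DispatchQueue.main.async {
            print(granted ? "Microphone access granted" : "Microphone access denied")
        }
    }
}
```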

Of course, the aesthetic of the sample app shown in the screenshot above is very different from how I wish to approach my project, but the hardest part of my project is solved. I ran it on my computer to make sure I understood how it should work so I could apply it to my own application. When applying it to my own app, I kept having to re-link the button and text field, which resulted in many connections I thought I had deleted; when I finally looked at the connections tab for each one, I saw that there were multiple connections for the same button and text field, which was not what I intended. Once I fixed that, the application ran nearly perfectly. The next issue I needed to address was having the button change color while it was pressed and recording versus when it was done recording. I was able to fix this with trial and error, but it did take a little bit of time! The challenge I ran into after getting the keyboard to show up was making it disappear again when you tap outside the text box, and making the text area scrollable. We had already seen this in class, but I wanted to make it work in my application. The keyboard now toggles based on whether the text field is selected or not; a sketch of this, along with the record-button color change, follows the screenshot below. The piece that still needs work is the scrolling: it works, but the view ends up way off the screen, and that needs to be changed. The result of my application is shown below:

[Screenshot: current state of my transcription app]

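Here is a small sketch of two of the fixes described above: toggling the record button's color while recording, and dismissing the keyboard when the user taps outside the text view. The class name, outlets, and action names are placeholders rather than my exact storyboard connections.

```swift
import UIKit

// A sketch of the record-button color toggle and tap-to-dismiss keyboard behavior.
class RecordingViewController: UIViewController {
    @IBOutlet weak var recordButton: UIButton!
    @IBOutlet weak var transcriptTextView: UITextView!

    private var isRecording = false

    override func viewDidLoad() {
        super.viewDidLoad()
        // Tapping anywhere outside the text view ends editing and hides the keyboard.
        let tap = UITapGestureRecognizer(target: self, action: #selector(dismissKeyboard))
        tap.cancelsTouchesInView = false
        view.addGestureRecognizer(tap)
    }

    @IBAction func recordTapped(_ sender: UIButton) {
        isRecording.toggle()
        // Red while recording, back to the default blue tint when stopped.
        recordButton.tintColor = isRecording ? .systemRed : .systemBlue
        // Start or stop the transcription here (see the TranscriptionEngine sketch above).
    }

    @objc private func dismissKeyboard() {
        view.endEditing(true)
    }
}
```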
The next challenge I want to tackle is making the toolbar with the sharrow and the trash button work; a rough sketch of my plan is below. Getting everything to this point, however, was daunting enough as is.
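My plan is to use the system share (.action) and trash bar button items so the icons match what iOS users already know, and to keep them in a bottom toolbar per the consistency feedback from class. This sketch builds on the hypothetical RecordingViewController above; the method names are placeholders.

```swift
import UIKit

// A sketch of the planned bottom toolbar with share ("sharrow") and trash items.
extension RecordingViewController {
    func configureToolbar() {
        let share = UIBarButtonItem(barButtonSystemItem: .action,
                                    target: self, action: #selector(shareTranscript))
        let spacer = UIBarButtonItem(barButtonSystemItem: .flexibleSpace,
                                     target: nil, action: nil)
        let trash = UIBarButtonItem(barButtonSystemItem: .trash,
                                    target: self, action: #selector(deleteTranscript))
        toolbarItems = [share, spacer, trash]
        navigationController?.setToolbarHidden(false, animated: false)
    }

    @objc func shareTranscript() {
        // The standard share sheet gives users the familiar iOS sharing options.
        let activity = UIActivityViewController(activityItems: [transcriptTextView.text ?? ""],
                                                applicationActivities: nil)
        present(activity, animated: true)
    }

    @objc func deleteTranscript() {
        transcriptTextView.text = ""
    }
}
```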

GitHub with Rough Draft


Milestone 4

I am very proud of this app. Of course, during show-and-tell time in class, the HDMI connection made the application crash. The combination of learning a new library on top of what we had already learned in class was a lot of fun to tackle on my own time. One of the biggest things I needed to keep in mind is that connections can accidentally get deleted while working on Auto Layout. While I was working on my app, I had somehow managed to delete a toolbar item connection, and it ended up making the application not work as I had hoped. In the end, the app worked and was adaptive on all devices, and I am very proud of what I was able to accomplish.

In terms of aesthetics, I did develop my own logo for this app, but the app still felt fairly bland in appearance to me; it could perhaps have used additional colors to make it more enjoyable for the user. Additionally, some users did not find it intuitive to press the logo button at the top of the screen (as some users are not familiar with Shazam, the app I borrowed that idea from). This is definitely something to keep in mind for future apps I choose to develop: even though the interaction may seem intuitive to most users, it is not intuitive to everyone, so I should design with that level of accessibility in mind as well. I also found it profoundly difficult to get the navigation bar working, which ended up pushing my project almost to the last minute before submission. If I had taken time to research that during Milestone 2 as well, the process could have gone a lot more smoothly. Otherwise, I thoroughly enjoyed working on turning my hopes for this app into a reality!